Abstract. A key tool in recent advances in understanding arithmetic
progressions and other patterns in subsets of the integers is certain norms or
seminorms. One example is the norms on $\mathbb{Z}/N\mathbb{Z}$ introduced by
Gowers in his proof of Szemerédi's Theorem, used to detect uniformity of
subsets of the integers. Another example is the seminorms on bounded functions
in a measure preserving system (associated to the averages in Furstenberg's
proof of Szemerédi's Theorem) defined by the authors. For each integer
$k \ge 1$, we define seminorms on $\ell^\infty(\mathbb{Z})$ analogous to these
norms and seminorms. We study the correlation of these norms with certain
algebraically defined sequences, which arise from evaluating a continuous
function on the homogeneous space of a nilpotent Lie group on an orbit (the
nilsequences). Using these seminorms, we define a dual norm that acts as an
upper bound for the correlation of a bounded sequence with a nilsequence. We
also prove an inverse theorem for the seminorms, showing how a bounded sequence
correlates with a nilsequence. As applications, we derive several ergodic
theoretic results, including a nilsequence version of the Wiener-Wintner
ergodic theorem, a nil version of a corollary to the spectral theorem, and a
weighted multiple ergodic convergence theorem.



1. Introduction 

1.1. Norms and seminorms. In his proof of Szemerédi's Theorem, Gowers [G]
introduced norms for functions defined on $\mathbb{Z}/N\mathbb{Z}$ that count
parallelepiped configurations and can be used to detect certain patterns (such
as arithmetic progressions) in subsets of the integers. In [HK1], we defined
seminorms on bounded measurable functions on a measure preserving system that
can be viewed as averages over parallelepipeds, and used them to control the
norm of multiple ergodic averages (such as those evaluated along arithmetic
progressions). Although the original definitions were quite different, it turns
out that the Gowers norms and the ergodic seminorms are almost the same object,
but are defined on different spaces: one on the space of functions on
$\mathbb{Z}/N\mathbb{Z}$ and the other on the space of bounded functions on a
measure space. We used the ergodic seminorms to define factors of a measure
space, and then showed that these factors have algebraic structure. This
algebraic structure is the main ingredient in proving convergence of multiple
ergodic averages along arithmetic progressions, and along other sequences.
Gowers norms have since been used in other contexts, including the proof of
Green and Tao [GT1] that the primes contain arbitrarily long arithmetic
progressions. The connection between nilsystems in ergodic theory and the
algebraic nature of analogous combinatorial objects has yet to be fully
understood. The beginning of this is carried out by Green and Tao (see [GT2],
[GT3] and [GT4]), including an inverse theorem for the third Gowers norm.
In this article, we define related seminorms on bounded sequences and prove
a structure theorem and an inverse theorem for them. We also give some ergodic
theoretic applications of these constructions. These applications include a
version of the Wiener-Wintner ergodic theorem extended to nilsequences, a
spectral type theorem for nilsequences, and a weighted ergodic theorem.
Polynomial versions of these results are contained in a forthcoming article.
All these properties depend on the connection to algebraic structures and we
describe these structures more precisely.

1.2. Nilsystems and nilsequences. In the inverse and structure theorems de- 
scribed above, a key role is played by algebraic objects, the nilsystems: 

Definition 1.1. Assume that $G$ is a $k$-step nilpotent Lie group and
$\Gamma \subset G$ is a discrete, cocompact subgroup of $G$. The compact
manifold $X = G/\Gamma$ is called a k-step nilmanifold. The Haar measure $\mu$
of $X$ is the unique probability measure invariant under the action
$x \mapsto g.x$ of $G$ on $X$ by left translations. Letting $T$ denote left
multiplication by the fixed element $\tau \in G$, we call $(X, \mu, T)$ a
k-step nilsystem.

Loosely speaking, the Structure Theorem of [HK1] states that if one wants to
understand the multiple ergodic averages

$$\frac{1}{N}\sum_{n=0}^{N-1} f_1(T^{n}x)\, f_2(T^{2n}x)\cdots f_k(T^{kn}x),$$

where $k \ge 1$ is an integer, $(X, \mu, T)$ is a measure preserving system, and
$f_1, \dots, f_k \in L^\infty(\mu)$, one can replace each function by its
conditional expectation on some nilsystem. Thus one can reduce the problem to
studying the same average in a nilsystem, reducing averaging in an arbitrary
system to a more tractable question.

A related problem is the study of the multicorrelation sequence

$$\int f(x)\, f(T^{n}x)\cdots f(T^{kn}x)\,d\mu(x),$$

where $k \ge 1$ is an integer, $(X, \mu, T)$ is a measure preserving system, and
$f \in L^\infty(\mu)$. In [BHK], we defined sequences that arise from
nilsystems (the nilsequences) and showed that a multicorrelation sequence can
be decomposed into a sequence that is small in terms of density and a $k$-step
nilsequence. We define this second term precisely:

Definition 1.2. Let $(X, \mu, T)$ be a $k$-step nilsystem, $f\colon X \to
\mathbb{C}$ a continuous function, $\tau \in G$, and $x_0 \in X$. The sequence
$(f(\tau^n x_0) : n \in \mathbb{Z})$ is a basic k-step nilsequence. If, in
addition, the function $f$ is smooth, then the sequence
$(f(\tau^n x_0) : n \in \mathbb{Z})$ is called a smooth k-step nilsequence. A
k-step nilsequence is a uniform limit of basic $k$-step nilsequences.

The family of $k$-step nilsequences forms a closed, shift invariant subalgebra
of sequences in $\ell^\infty(\mathbb{Z})$. One step nilsequences are exactly
the almost periodic sequences. An example of a 2-step nilsequence is the
sequence $(\exp(\pi i n(n-1)\alpha) : n \in \mathbb{Z})$, where $\alpha$ lies
in the torus $\mathbb{T} = \mathbb{R}/\mathbb{Z}$. (The collection of all
2-step nilsequences is described fully and classified in [HK2].)

1 $X$ is endowed with its Borel $\sigma$-algebra $\mathcal{X}$. In general, we
omit the associated $\sigma$-algebra from our notation, writing $(X, \mu, T)$
for a measure preserving probability system rather than
$(X, \mathcal{X}, \mu, T)$. We implicitly assume that all measure preserving
systems are probability systems.
1.3. Direct theorems and inverse theorems. We define a new seminorm on
bounded sequences and use this seminorm, an associated dual norm, and
nilsequences to derive direct and inverse theorems. These seminorms on
$\ell^\infty(\mathbb{Z})$ arise via an averaging process, and there is more
than one natural way to take such an average. The first is looking along a
particular sequence of intervals of integers whose lengths tend to infinity,
and taking the average over these intervals. This corresponds, in some sense,
to a local point of view, as such an averaging scheme does not take into
account what happens outside this particular sequence of intervals. A second
way to take an average is to allow all choices of intervals. This uniform point
of view gives us further information on the original sequence.

Averaging in $\mathbb{Z}$, the first version gives rise to the classic notion
of density, taking the proportion of a set relative to the sequence of
intervals $[1, \dots, N]$, while the second gives rise to the slightly
different notion of Banach density, where the density is computed relative to
any sequence of intervals whose lengths tend to infinity. Each type of
averaging gives rise, for each integer $k \ge 1$, to some sort of uniformity
measurement (seminorm, norm, or a version thereof) on bounded sequences.

We use the seminorms associated to each of these averaging methods to address
analogs of combinatorial results. A classical problem in combinatorics is to
start with a finite set $A$ of integers (for example) and say something about
properties of sets that can be built from $A$, such as the sumset $A + A$ or
product set $A \cdot A$. Such results are referred to as direct theorems.
Inverse theorems start with the sumset, product set, or other information
derived from a finite set, and then try to deduce information about the set
itself.

We prove both a direct theorem and an inverse theorem. For the direct theorem, 
we show that there is a dual norm that acts as an upper bound on the correlation of 
a bounded sequence with a nilsequence. We also prove an inverse theorem for the 
seminorms, showing how a bounded sequence correlates with a nilsequence. This
is an $\ell^\infty$ version of the Gowers Inverse Conjecture made by Green and
Tao [GT3]. This conjecture was resolved by them for the third Gowers norm in
[GT4].

Using the direct theorems, we derive a weighted multiple ergodic convergence 
theorem. We believe that one should be able to use these methods to derive other 
combinatorial results. 

The tools used in this paper have several sources. One is a version of the Fursten- 
berg Correspondence Principle (see [F]), used to translate the problems into ergodic 
theoretic statements. Another is the connection of the seminorms we define with 
the algebraic structure of nilsystems, using properties of the ergodic seminorms 
developed in [HK1]. Throughout, we use some harmonic analysis on nilmanifolds.

This article can be viewed as an ergodic perspective on the development of a 
"higher order Fourier analysis" that has been proposed by Green and Tao [GT3]. 
Our direct results develop harmonic analysis relative to the standard Fourier
analytic methods and our local inverse results lend support to the Green-Tao
conjecture of an inverse theorem for the Gowers norms.

1.4. Organization of the paper. In the next section, we define the seminorms
on $\ell^\infty(\mathbb{Z})$ and give their basic properties. We then state the
main results first for $k = 2$ and then for general $k$, with the intention of
clarifying the objects under study. Section 3 gives the background on ergodic
seminorms and nilsystems. In Section 4 we give a presentation of the
Correspondence Principle that allows us to prove the properties of the
$\ell^\infty(\mathbb{Z})$ seminorms introduced in Section 2. In Section 5
we study the dual norm associated to these seminorms and use it to prove the
direct theorems on the seminorms. We prove the inverse theorems in Section 6
using an extension of the Correspondence Principle and in Section 7 we give
some ergodic theoretic consequences of these results. Throughout we make use of
the connection with the ergodic seminorms.

2. Summary of the results 

We introduce seminorms on $\ell^\infty(\mathbb{Z})$ corresponding to the Gowers
norms [G] in the finite setting and to the seminorms in ergodic theory
introduced in [HK1]. We begin with some definitions and statements of the main
properties. After defining the relevant seminorms, we give the statements of
the results, beginning with the sample case of $k = 2$.

Notation. We write sequences as $\mathbf{a} = (a_n : n \in \mathbb{Z})$ and we
write the uniform norm of this sequence as $\|\mathbf{a}\|_\infty$.

By an interval, we mean an interval in $\mathbb{Z}$. If $I$ is an interval,
$|I|$ denotes its length.

We write $z \mapsto Cz$ for complex conjugation in $\mathbb{C}$. Thus
$C^k z = z$ if $k$ is an even integer and $C^k z = \bar{z}$ if $k$ is an odd
integer.

For every $k \ge 1$, points of $\mathbb{Z}^k$ are written
$h = (h_1, \dots, h_k)$. For $\epsilon = (\epsilon_1, \dots, \epsilon_k) \in
\{0,1\}^k$ and $h = (h_1, \dots, h_k) \in \mathbb{Z}^k$, we define

$$|\epsilon| = \epsilon_1 + \dots + \epsilon_k \quad\text{and}\quad
\epsilon \cdot h = \epsilon_1 h_1 + \dots + \epsilon_k h_k .$$

Further notation on averages of sequences of intervals is given at the end of
this Section.

2.1. The local "seminorms" and the uniformity seminorms on
$\ell^\infty(\mathbb{Z})$. We define two quantities that are measurements on
bounded sequences. The proofs rely on material from a variety of sources
(summarized in Section 3) and some machinery that we develop, and so we
postpone them until Section 4. In fact, some of the properties stated in this
section can be proved via direct computations. However, we prefer proofs
relying on the Furstenberg correspondence principle, as we use a modification
of this principle to prove stronger results.

We introduce the property that allows us to define certain "seminorms." 

Definition 2.1. Let $k \ge 1$ be an integer, $\mathbf{a} = (a_n : n \in
\mathbb{Z})$ be a bounded sequence, and $\mathbf{I} = (I_j : j \ge 1)$ be a
sequence of intervals whose lengths tend to infinity. We say that the sequence
$\mathbf{a}$ satisfies property $\mathcal{P}(k)$ on $\mathbf{I}$ if for all
$h = (h_1, \dots, h_k) \in \mathbb{Z}^k$, the limit

$$\lim_{j\to+\infty} \frac{1}{|I_j|} \sum_{n\in I_j}
\prod_{\epsilon\in\{0,1\}^k} C^{|\epsilon|} a_{n+\epsilon\cdot h}$$

exists. We denote this limit by $c_h(\mathbf{I}, \mathbf{a})$.

Given a bounded sequence $\mathbf{a}$ and a sequence of intervals whose lengths
tend to infinity, one can always pass to a subsequence on which $\mathbf{a}$
satisfies $\mathcal{P}(k)$.

Proposition 2.2. Let $k \ge 1$ be an integer, $\mathbf{I} = (I_j : j \ge 1)$ be
a sequence of intervals whose lengths tend to infinity, and let $\mathbf{a}$ be
a bounded sequence satisfying property $\mathcal{P}(k)$ on $\mathbf{I}$. Then
the limit

$$\lim_{H\to+\infty} \frac{1}{H^k} \sum_{h_1,\dots,h_k=0}^{H-1}
c_h(\mathbf{I}, \mathbf{a})$$

exists and is non-negative.

Using this proposition, we define: 

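Definition 2.3. For an integer $k \ge 1$, a sequence of intervals
$\mathbf{I} = (I_j : j \ge 1)$, and a bounded sequence $\mathbf{a}$ satisfying
property $\mathcal{P}(k)$ on $\mathbf{I}$, define

$$\|\mathbf{a}\|_{\mathbf{I},k} = \Bigl( \lim_{H\to+\infty} \frac{1}{H^k}
\sum_{h_1,\dots,h_k=0}^{H-1} c_h(\mathbf{I}, \mathbf{a}) \Bigr)^{1/2^k} .$$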
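
As a numerical illustration that is not part of the paper's argument, the quantity $\bigl(\lim_H H^{-k}\sum_{h\in[0,H)^k} c_h(\mathbf{I},\mathbf{a})\bigr)^{1/2^k}$ can be approximated by truncating both limits to a single finite interval and a finite range of shifts. The sketch below assumes this truncation; the names `local_seminorm_estimate` and `linear_phase`, the window `[start, start + length)` and the cutoff `H` are hypothetical choices made for the example, and the output is only a proxy for the true limit.

```python
import cmath
from itertools import product


def local_seminorm_estimate(a, k, start, length, H):
    """Finite-window proxy for the local "seminorm" of order k.

    a             -- function n -> number, the bounded sequence
    k             -- order of the seminorm (k >= 1)
    start, length -- the interval I = [start, start + length)
    H             -- shifts h range over [0, H)^k
    Both limits in the definition are replaced by finite averages.
    """
    vertices = list(product((0, 1), repeat=k))
    total = 0.0 + 0.0j
    for h in product(range(H), repeat=k):
        inner = 0.0 + 0.0j
        for n in range(start, start + length):
            term = 1.0 + 0.0j
            for eps in vertices:
                value = a(n + sum(e * s for e, s in zip(eps, h)))
                # C^{|eps|}: conjugate when eps has an odd number of ones
                term *= value.conjugate() if sum(eps) % 2 else value
            inner += term
        total += inner / length
    mean = total / H ** k
    # The exact limit is real and non-negative; clamp rounding noise.
    return max(mean.real, 0.0) ** (1.0 / 2 ** k)


def linear_phase(n, alpha=2 ** 0.5):
    """The sequence e(n * alpha) with an irrational alpha (assumed example)."""
    return cmath.exp(2j * cmath.pi * alpha * n)
```

For the linear phase $e(n\alpha)$ the order 2 product $a_n\overline{a_{n+h_1}}\,\overline{a_{n+h_2}}\,a_{n+h_1+h_2}$ is identically 1, so the order 2 estimate equals 1 up to rounding, while the order 1 estimate is small once `H` is large.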



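We call $\|\cdot\|_{\mathbf{I},k}$ a local "seminorm" (with quotes on the word
seminorm), because the space of sequences satisfying property $\mathcal{P}(k)$
on $\mathbf{I}$ is not a vector space. On the other hand, we do have:

Proposition 2.4. Assume that $k \ge 1$ is an integer, $\mathbf{a}$ and
$\mathbf{b}$ are bounded sequences, and $\mathbf{I}$ is a sequence of intervals
whose lengths tend to infinity. If $\mathbf{a}$, $\mathbf{b}$ and
$\mathbf{a} + \mathbf{b}$ satisfy property $\mathcal{P}(k)$ on $\mathbf{I}$,
then $\|\mathbf{a}+\mathbf{b}\|_{\mathbf{I},k} \le
\|\mathbf{a}\|_{\mathbf{I},k} + \|\mathbf{b}\|_{\mathbf{I},k}$.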

The "seminorms" are also non-decreasing with $k$:

Proposition 2.5. If the bounded sequence $\mathbf{a}$ satisfies properties
$\mathcal{P}(k)$ and $\mathcal{P}(k+1)$ on the sequence of intervals
$\mathbf{I}$, then $\|\mathbf{a}\|_{\mathbf{I},k} \le
\|\mathbf{a}\|_{\mathbf{I},k+1}$.

We use the "seminorm" to define a measure of uniformity (a uniformity
seminorm) on bounded sequences:

Definition 2.6. Let $\mathbf{a} = (a_n : n \in \mathbb{Z})$ be a bounded
sequence and let $k \ge 1$ be an integer. We define the k-uniformity seminorm
$\|\mathbf{a}\|_{U(k)}$ to be the supremum of $\|\mathbf{a}\|_{\mathbf{I},k}$,
where the supremum is taken over all sequences of intervals $\mathbf{I}$ on
which $\mathbf{a}$ satisfies property $\mathcal{P}(k)$.

Using Proposition 2.4, by passing, if necessary, to subsequences of the
sequences of intervals, we immediately deduce:

Proposition 2.7. For every integer $k \ge 2$, $\|\cdot\|_{U(k)}$ is a seminorm
on $\ell^\infty(\mathbb{Z})$.

2.2. Comments on the definitions.

2.2.1. The definitions of $\|\mathbf{a}\|_{\mathbf{I},k}$ and
$\|\mathbf{a}\|_{U(k)}$ are very similar to those of the Gowers norms
introduced in [G] in the finite setting (meaning, for sequences indexed by
$\mathbb{Z}/N\mathbb{Z}$). In the sequel, we establish analogs of properties of
Gowers norms for the $\ell^\infty(\mathbb{Z})$ seminorms. The
$\ell^\infty(\mathbb{Z})$ seminorms are also close relatives of the ergodic
seminorms of [HK1]. In the sequel we show that this resemblance is not merely
formal; the link between the $\ell^\infty(\mathbb{Z})$ seminorms and the
ergodic seminorms is a basic tool of this paper.

2.2.2. It can be shown that in Proposition 2.2 the averages on $[0, H-1]^k$ can
be replaced by averages on any sequence of "rectangles"
$(I_{H,1} \times \dots \times I_{H,k} : H \ge 1)$, where $I_{H,j}$ is an
interval for every $j \in \{1, \dots, k\}$ and every $H$, and
$\min_j |I_{H,j}| \to +\infty$ as $H \to +\infty$; more generally we could also
average over any Følner sequence in $\mathbb{Z}^k$.




2.2.3. For clarity, we explain what the definitions mean when $k = 1$. (We
discuss $k = 2$ in the next section.) Let $\mathbf{a} = (a_n : n \in
\mathbb{Z})$ be a bounded sequence and let $\mathbf{I} = (I_j : j \ge 1)$ be a
sequence of intervals whose lengths tend to infinity. Property $\mathcal{P}(1)$
says that for every $h \in \mathbb{Z}$, the averages

$$\frac{1}{|I_j|} \sum_{n\in I_j} a_n \overline{a_{n+h}}$$

converge as $j \to +\infty$, and the definition of
$\|\mathbf{a}\|_{\mathbf{I},1}$ is

$$\|\mathbf{a}\|_{\mathbf{I},1} = \Bigl( \lim_{H\to+\infty} \frac{1}{H}
\sum_{h=0}^{H-1} \lim_{j\to+\infty} \frac{1}{|I_j|} \sum_{n\in I_j}
a_n \overline{a_{n+h}} \Bigr)^{1/2} .$$

Furthermore,

$$\|\mathbf{a}\|_{\mathbf{I},1} \ge \limsup_{j\to+\infty}
\Bigl| \frac{1}{|I_j|} \sum_{n\in I_j} a_n \Bigr|$$

and

$$\|\mathbf{a}\|_{U(1)} = \lim_{N\to+\infty} \sup_{M\in\mathbb{Z}}
\Bigl| \frac{1}{N} \sum_{n=M}^{M+N-1} a_n \Bigr| .$$

The first property follows easily from the van der Corput Lemma (see
Appendix A) and the second can probably also be proved directly. Both
properties also follow from the discussion in Section 4.2.



2.2.4. The difference between the local "seminorms" and the uniformity
seminorms is best illustrated by considering a randomly generated sequence. Let
$\mathbf{a} = (a_n : n \in \mathbb{Z})$ be a random sequence, where the $a_n$
are independent random variables, taking the values $+1$ and $-1$ each with
probability $1/2$. Let $\mathbf{I} = (I_j : j \ge 1)$ be a sequence of
intervals whose lengths tend to infinity. Then for every integer $k \ge 1$, the
sequence $\mathbf{a}$ satisfies property $\mathcal{P}(k)$ on $\mathbf{I}$
almost surely and $\|\mathbf{a}\|_{\mathbf{I},k} = 0$. On the other hand, we
have that $\|\mathbf{a}\|_{U(k)} = 1$ almost surely. Indeed, for every integer
$j \ge 1$ there exists an interval $I_j$ of length $j$ on which the sequence
$\mathbf{a}$ is constant and equal to 1; taking $\mathbf{I}$ to be this
sequence of intervals, we have that $\|\mathbf{a}\|_{\mathbf{I},k} = 1$ for
every integer $k \ge 1$. The apparent contradiction arises only because there
are uncountably many sequences of intervals to choose from, so the almost sure
statement for each fixed $\mathbf{I}$ does not hold simultaneously for all
$\mathbf{I}$.
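
The phenomenon can be seen in a finite simulation, offered here only as a hedged sketch. It looks at plain averages (the simplest, order 1 type of measurement) of a pseudo-random $\pm 1$ sample: the average over a fixed long interval is near 0, while the average over a hand-picked interval on which the sample is constant equals 1. The sample length `N`, the seed and the helper names are assumptions of this example; a finite sample only contains constant runs of modest length, whereas an infinite random sequence almost surely contains constant runs of every length.

```python
import random


def window_mean(a, start, length):
    """Average of the list a over [start, start + length)."""
    return sum(a[start:start + length]) / length


def longest_run_of_ones(a):
    """Return (start, length) of the longest block of consecutive +1 in a."""
    best_start, best_len, run_start, run_len = 0, 0, 0, 0
    for i, value in enumerate(a):
        if value == 1:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return best_start, best_len


rng = random.Random(0)  # fixed seed, for reproducibility only
N = 20000
a = [rng.choice((-1, 1)) for _ in range(N)]

# Average over the fixed interval [0, N): close to 0.
mean_on_initial_interval = window_mean(a, 0, N)

# Average over an interval chosen after looking at the sample: exactly 1.
run_start, run_len = longest_run_of_ones(a)
mean_on_run = window_mean(a, run_start, run_len)
```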

2.2.5. There are nontrivial bounded sequences for which the uniformity seminorm 
is 0. This is illustrated by the following particular case of Corollary [3TTJ 

Proposition 2.8. Let $k \ge 1$ be an integer and assume that $(X, T)$ is a
uniquely ergodic system (with invariant measure $\mu$) that is weakly mixing.
If $f$ is a function on $X$ with $\int f \, d\mu = 0$, then for every
$x \in X$, the sequence $(f(T^n x) : n \in \mathbb{Z})$ has k-uniformity
seminorm 0.

2.3. The case k = 2. To further clarify the statements, we explain some of our 
general results in the particular case that k = 2. These results are prototypes for 
the general case, but are simpler to state and prove. Most of these results can be 
proved without resorting to any significant machinery and we include one of the 
simpler proofs here. 

Notation. We write $\mathbb{T} = \mathbb{R}/\mathbb{Z}$. For
$t \in \mathbb{T}$, $e(t) = \exp(2\pi i t)$.
The first result explains the role of the local "seminorm", namely that it acts
as an upper bound:

Proposition 2.9. If $\mathbf{a} = (a_n : n \in \mathbb{Z})$ is a bounded
sequence satisfying $\mathcal{P}(2)$ on the sequence of intervals
$\mathbf{I} = (I_j : j \ge 1)$, then

$$\limsup_{j\to+\infty} \sup_{t\in\mathbb{T}} \Bigl| \frac{1}{|I_j|}
\sum_{n\in I_j} a_n e(nt) \Bigr| \le \|\mathbf{a}\|_{\mathbf{I},2} .$$



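Proof. We can assume that $\|\mathbf{a}\|_\infty \le 1$. By the van der Corput
Lemma (Appendix A), the Cauchy-Schwarz Inequality, and another application of
the van der Corput Lemma, we have that for all integers $j, H \ge 1$, and all
$t \in \mathbb{T}$,

$$\begin{aligned}
\Bigl| \frac{1}{|I_j|} \sum_{n\in I_j} a_n e(nt) \Bigr|^4
&\le \frac{cH}{|I_j|} + \Bigl| \sum_{h=-H}^{H} \frac{H-|h|}{H^2}\, e(-ht)\,
\frac{1}{|I_j|} \sum_{n\in I_j} a_n \overline{a_{n+h}} \Bigr|^2 \\
&\le \frac{c'H}{|I_j|} + \sum_{h=-H}^{H} \frac{H-|h|}{H^2}
\Bigl| \frac{1}{|I_j|} \sum_{n\in I_j} a_n \overline{a_{n+h}} \Bigr|^2 \\
&\le \frac{c''H}{|I_j|} + \sum_{\ell=-H}^{H} \sum_{h=-H}^{H}
\frac{H-|\ell|}{H^2}\, \frac{H-|h|}{H^2}\,
\frac{1}{|I_j|} \sum_{n\in I_j} a_n \overline{a_{n+h}}\,
\overline{a_{n+\ell}}\, a_{n+h+\ell} ,
\end{aligned}$$

where $c, c', c''$ are universal constants. Taking the limit as $j \to +\infty$
first (recall that the sequence $\mathbf{a}$ satisfies $\mathcal{P}(2)$ on the
sequence of intervals $\mathbf{I}$), and then as $H \to +\infty$, we have the
announced result. □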

We use this to show how such a sequence a correlates with almost periodic 
sequences. First a definition: 

Definition 2.10. A sequence of the form $(e(nt) : n \in \mathbb{Z})$ is called
a complex exponential sequence. A sequence is a trigonometric polynomial if it
is a finite linear combination of complex exponential sequences. An almost
periodic sequence is a uniform limit of trigonometric polynomials.



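By approximation, it follows immediately from Proposition 2.9 that:

Corollary 2.11. Let $\mathbf{b} = (b_n : n \in \mathbb{Z})$ be an almost
periodic sequence. Then for every $\delta > 0$, there exists a constant
$c = c(\mathbf{b}, \delta)$ such that if a bounded sequence
$\mathbf{a} = (a_n : n \in \mathbb{Z})$ satisfies property $\mathcal{P}(2)$ on
a sequence of intervals $\mathbf{I} = (I_j : j \ge 1)$, then

$$\limsup_{j\to+\infty} \Bigl| \frac{1}{|I_j|} \sum_{n\in I_j} a_n b_n \Bigr|
\le c\, \|\mathbf{a}\|_{\mathbf{I},2} + \delta\, \|\mathbf{a}\|_\infty .$$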



For some almost periodic sequences we have more precise bounds. A smooth
almost periodic sequence $\mathbf{b} = (b_n : n \in \mathbb{Z})$ (that is, a
smooth 1-step nilsequence) can be written as

$$b_n = \sum_{m=1}^{\infty} \lambda_m e(n t_m) ,$$

where $t_m$, $m \ge 1$, are distinct elements of $\mathbb{T}$ and
$\lambda_m \in \mathbb{C}$, $m \ge 1$, satisfy

$$\sum_{m=1}^{\infty} |\lambda_m| < +\infty .$$
We define

$$|||\mathbf{b}|||_2^* = \Bigl( \sum_{m=1}^{\infty} |\lambda_m|^{4/3}
\Bigr)^{3/4}$$

and we have that:

Proposition 2.12. Let $\mathbf{a} = (a_n : n \in \mathbb{Z})$ be a bounded
sequence satisfying property $\mathcal{P}(2)$ on the sequence of intervals
$\mathbf{I} = (I_j : j \ge 1)$ and $\mathbf{b} = (b_n : n \in \mathbb{Z})$ be a
smooth almost periodic sequence. Then,

$$\limsup_{j\to+\infty} \Bigl| \frac{1}{|I_j|} \sum_{n\in I_j} a_n b_n \Bigr|
\le \|\mathbf{a}\|_{\mathbf{I},2}\, |||\mathbf{b}|||_2^* .$$

The constant $|||\mathbf{b}|||_2^*$ here is the best possible. Undoubtedly, one
could prove this result without resorting to special machinery, but we do not
attempt this method as this is a particular case of a general result
(Theorem 2.13). In fact we show that the norm $|||\cdot|||_2^*$ acts as the
dual of the seminorm $\|\cdot\|_{U(2)}$.
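
For a trigonometric polynomial the quantity $\bigl(\sum_m |\lambda_m|^{4/3}\bigr)^{3/4}$ is a finite sum and can be evaluated directly. The following sketch is only an illustration; the function name `dual_norm_2` is hypothetical, and the caller is assumed to pass the coefficients $\lambda_m$ attached to pairwise distinct frequencies $t_m$.

```python
def dual_norm_2(coefficients):
    """Evaluate (sum |lambda_m|^(4/3))^(3/4) for a finite coefficient list.

    The coefficients are assumed to belong to pairwise distinct
    frequencies t_m; the frequencies themselves do not enter the value.
    """
    return sum(abs(c) ** (4.0 / 3.0) for c in coefficients) ** 0.75
```

For a single exponential the value is $|\lambda_1|$. In general it is an $\ell^{4/3}$ norm of the coefficients, so it lies between their $\ell^2$ and $\ell^1$ norms.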

2.4. Main results. Let $k \ge 2$ be an integer. In Section 5.3, for every
$(k-1)$-step nilmanifold $X$ we define a norm $|||\cdot|||_k^*$ on the space
$C^\infty(X)$ of smooth functions on $X$. We defer the precise definition, as
it requires development of some further background. Let $\mathbf{b}$ be a
smooth $(k-1)$-step nilsequence. Then there exist an ergodic $(k-1)$-step
nilsystem $(X, T)$ (Corollary 3.3), a smooth function $f$ on $X$, and a point
$x_0 \in X$ with

$$b_n = f(T^n x_0) \quad \text{for every } n \in \mathbb{Z} .$$

The same sequence $\mathbf{b}$ can be represented in this way in several
manners, with different systems, different starting points, and different
functions, but we show (Corollary 5.8) that all associated functions $f$ have
the same norm $|||f|||_k^*$. Therefore we can define
$|||\mathbf{b}|||_k^* = |||f|||_k^*$, where $f$ is any of the possible
functions.

2.4.1. Direct results. Using this norm, we have generalizations of the results
already given for $k = 2$:

Theorem 2.13 (Direct Theorem). Let $\mathbf{a} = (a_n : n \in \mathbb{Z})$ be a
bounded sequence that satisfies property $\mathcal{P}(k)$ on the sequence of
intervals $\mathbf{I} = (I_j : j \ge 1)$. For all $(k-1)$-step smooth
nilsequences $\mathbf{b}$, we have

$$\limsup_{j\to+\infty} \Bigl| \frac{1}{|I_j|} \sum_{n\in I_j} a_n b_n \Bigr|
\le \|\mathbf{a}\|_{\mathbf{I},k}\, |||\mathbf{b}|||_k^* .$$

By density, Theorem 2.13 immediately implies:

Corollary 2.14. Let $\mathbf{b} = (b_n : n \in \mathbb{Z})$ be a $(k-1)$-step
nilsequence and $\delta > 0$. There exists a constant
$c = c(\mathbf{b}, \delta)$ such that for every bounded sequence
$\mathbf{a} = (a_n : n \in \mathbb{Z})$ satisfying property $\mathcal{P}(k)$ on
a sequence of intervals $\mathbf{I} = (I_j : j \ge 1)$, we have

$$\limsup_{j\to+\infty} \Bigl| \frac{1}{|I_j|} \sum_{n\in I_j} a_n b_n \Bigr|
\le c\, \|\mathbf{a}\|_{\mathbf{I},k} + \delta\, \|\mathbf{a}\|_\infty .$$

Using these results, we immediately deduce uniform versions: 
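Corollary 2.15. Let $\mathbf{b} = (b_n : n \in \mathbb{Z})$ be a smooth
$(k-1)$-step nilsequence and $\mathbf{a} = (a_n : n \in \mathbb{Z})$ be a
bounded sequence. Then

$$\lim_{N\to+\infty} \sup_{M\in\mathbb{Z}} \Bigl| \frac{1}{N}
\sum_{n=M}^{N+M-1} a_n b_n \Bigr| \le \|\mathbf{a}\|_{U(k)}\,
|||\mathbf{b}|||_k^* .$$

Let $\mathbf{b} = (b_n : n \in \mathbb{Z})$ be a $(k-1)$-step nilsequence and
let $\delta > 0$. There exists a constant $c = c(\mathbf{b}, \delta)$ such that
for every bounded sequence $\mathbf{a} = (a_n : n \in \mathbb{Z})$,

$$\lim_{N\to+\infty} \sup_{M\in\mathbb{Z}} \Bigl| \frac{1}{N}
\sum_{n=M}^{N+M-1} a_n b_n \Bigr| \le c\, \|\mathbf{a}\|_{U(k)} + \delta\,
\|\mathbf{a}\|_\infty .$$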

We refer to these results as direct results, meaning that we start with a
sequence and derive its correlation with nilsequences. One can view them as
upper bounds, because they give an upper bound on the correlation of a sequence
with a nilsequence.

2.4.2. Inverse results. The next results are in the opposite direction of the direct 
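results of the previous section, and we refer to them as "inverse results".

Theorem 2.16 (Inverse Theorem). Let $\mathbf{a} = (a_n : n \in \mathbb{Z})$ be
a bounded sequence. Then for every $\delta > 0$, there exists a $(k-1)$-step
smooth nilsequence $\mathbf{b} = (b_n : n \in \mathbb{Z})$ such that

$$|||\mathbf{b}|||_k^* = 1 \quad\text{and}\quad \lim_{N\to+\infty}
\sup_{M\in\mathbb{Z}} \Bigl| \frac{1}{N} \sum_{n=M}^{M+N-1} a_n b_n \Bigr|
\ge \|\mathbf{a}\|_{U(k)} - \delta .$$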

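Summarizing this theorem and Corollary 2.15, we have

Corollary 2.17. For every bounded sequence
$\mathbf{a} = (a_n : n \in \mathbb{Z})$,

$$\|\mathbf{a}\|_{U(k)} = \sup_{\substack{\mathbf{b} = (b_n) \text{ is a
smooth nilsequence} \\ \text{and } |||\mathbf{b}|||_k^* = 1}}
\ \lim_{N\to+\infty} \sup_{M\in\mathbb{Z}} \Bigl| \frac{1}{N}
\sum_{n=M}^{N+M-1} a_n b_n \Bigr| .$$

This means that we can view the norm $|||\cdot|||_k^*$ as the dual norm of the
uniformity seminorm $\|\cdot\|_{U(k)}$.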

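Corollary 2.18. For a bounded sequence $\mathbf{a} = (a_n : n \in \mathbb{Z})$,
the following properties are equivalent:

(i) $\|\mathbf{a}\|_{U(k)} = 0$.

(ii) $\displaystyle \lim_{N\to+\infty} \sup_{M\in\mathbb{Z}} \Bigl| \frac{1}{N}
\sum_{n=M}^{N+M-1} a_n b_n \Bigr| = 0$ for every $(k-1)$-step smooth
nilsequence $\mathbf{b} = (b_n : n \in \mathbb{Z})$.

(iii) $\displaystyle \lim_{N\to+\infty} \sup_{M\in\mathbb{Z}} \Bigl|
\frac{1}{N} \sum_{n=M}^{N+M-1} a_n b_n \Bigr| = 0$ for every $(k-1)$-step
nilsequence $\mathbf{b} = (b_n : n \in \mathbb{Z})$.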

For $k = 2$, Corollary 2.18, Proposition 2.9 and a density argument imply that
the three equivalent conditions of Corollary 2.18 are also equivalent to

(iv) For every $t \in \mathbb{T}$, $\displaystyle \lim_{N\to+\infty}
\sup_{M\in\mathbb{Z}} \Bigl| \frac{1}{N} \sum_{n=M}^{N+M-1} a_n e(nt) \Bigr|
= 0$.

(v) $\displaystyle \lim_{N\to+\infty} \sup_{t\in\mathbb{T}}
\sup_{M\in\mathbb{Z}} \Bigl| \frac{1}{N} \sum_{n=M}^{M+N-1} a_n e(nt) \Bigr|
= 0$.



2.4.3. A counterexample. It is important to note that the inverse results have
no version involving local "seminorms", and we give here an example
illustrating this point.

Let $(N_j : j \ge 1)$ be an increasing sequence of integers with $N_1 = 0$ and
tending sufficiently fast to $+\infty$. For $j \ge 1$ let
$I_j = [N_j, N_{j+1} - 1]$ and let $\mathbf{I} = (I_j : j \ge 1)$. Let the
sequence $\mathbf{a}$ be defined by $a_n = e(n/j)$ if
$N_j \le |n| < N_{j+1}$. Then $\|\mathbf{a}\|_{\mathbf{I},2} = 1$ and for every
$t \in \mathbb{T}$, the average of $a_n e(nt)$ on the interval $I_j$ converges
to zero as $j \to +\infty$. Therefore, for every almost periodic sequence
$\mathbf{b}$, the average of $a_n b_n$ on $I_j$ also converges to zero.

This highlights a difference between the finite case, where the norms are
defined on $\mathbb{Z}/N\mathbb{Z}$, and the infinite case. In the finite case,
one cannot construct such a sequence, where the behavior worsens as one tends
to infinity.
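
The two features of the example can be checked numerically on a single block, as a rough sketch rather than a proof. On the $j$-th block the sequence is the linear phase $e(n/j)$, so the order 2 product $a_n\overline{a_{n+h_1}}\,\overline{a_{n+h_2}}\,a_{n+h_1+h_2}$ is identically 1, while the correlation with a fixed exponential $e(nt)$ is small once the block is long compared with $j$. The helper names, the block index `j = 50`, the window and the sample values of `t` are all assumptions of this illustration, and the block function is extended to all integers so that edge effects are ignored.

```python
import cmath


def e(t):
    """e(t) = exp(2 pi i t)."""
    return cmath.exp(2j * cmath.pi * t)


def block_sequence(j):
    """Return n -> e(n / j), the value of the sequence on the j-th block."""
    return lambda n: e(n / j)


def correlation_with_exponential(a, start, length, t):
    """|average over [start, start + length) of a(n) e(n t)|."""
    total = sum(a(n) * e(n * t) for n in range(start, start + length))
    return abs(total / length)


def parallelogram_average(a, start, length, h1, h2):
    """Average of a(n) conj(a(n+h1)) conj(a(n+h2)) a(n+h1+h2) over the window."""
    total = sum(
        a(n) * a(n + h1).conjugate() * a(n + h2).conjugate() * a(n + h1 + h2)
        for n in range(start, start + length)
    )
    return total / length
```

For a fixed block index $j$ the correlation with $e(-n/j)$ is of course equal to 1; the point of the example is that for each fixed $t$ the frequencies $1/j$ eventually move away from $-t$ as $j$ grows.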

2.5. A condition for convergence. 

Theorem 2.19. For a bounded sequence $\mathbf{a} = (a_n : n \in \mathbb{Z})$,
the following are equivalent.

(i) For every $\delta > 0$, the sequence $\mathbf{a}$ can be written as
$\mathbf{a}' + \mathbf{a}''$ where $\mathbf{a}'$ is a $(k-1)$-step nilsequence
and $\|\mathbf{a}''\|_{U(k)} < \delta$.

(ii) For every $(k-1)$-step nilsequence $\mathbf{c} = (c_n : n \in
\mathbb{Z})$, the averages of $a_n c_n$ converge, meaning that the limit

$$\lim_{j\to+\infty} \frac{1}{|I_j|} \sum_{n\in I_j} a_n c_n$$

exists for every sequence $(I_j : j \ge 1)$ of intervals whose lengths tend to
infinity.

In Proposition 7.1 we give a method to build sequences satisfying the
(equivalent) properties of Theorem 2.19, checking that the sequences verify the
first property. As this proposition uses material not yet defined, we do not
state it here but only give two examples of its application.

A generalized polynomial is defined to be a real valued function that is
obtained from the identity function and real constants by using (in arbitrary
order) the operations of addition, multiplication, and taking the integer part.
We have:

Proposition 2.20. Let $p$ be a generalized polynomial and for every
$n \in \mathbb{Z}$, let $\{p(n)\}$ be the fractional part of $p(n)$. Then the
sequences $(\{p(n)\} : n \in \mathbb{Z})$ and $(e(p(n)) : n \in \mathbb{Z})$
satisfy the (equivalent) properties of Theorem 2.19.

The Thue-Morse sequence $\mathbf{a} = (a_n : n \in \mathbb{Z})$ is given by
$a_n = 1$ if the sum of the digits of $|n|$ written in base 2 is odd and
$a_n = 0$ otherwise. In Section 7.2 we show:

Proposition 2.21. The Thue-Morse sequence satisfies the properties of
Theorem 2.19.

A similar method can be used for other sequences, for example for all sequences
associated to primitive substitutions of constant length (see [Q] for the
definition).
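
The digit-sum description of the Thue-Morse sequence translates directly into code. The sketch below assumes the convention that $a_n$ is 1 when the binary digit sum of $|n|$ is odd and 0 otherwise; the function name `thue_morse` is chosen for this illustration.

```python
def thue_morse(n):
    """a_n = 1 if the sum of the binary digits of |n| is odd, else 0."""
    return bin(abs(n)).count("1") % 2
```

With this convention the sequence is symmetric in $n$ and satisfies the substitution-type relations $a_{2n} = a_n$ and $a_{2n+1} = 1 - a_n$ for $n \ge 0$.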

2.6. An application to ergodic theory. 
2.6.1. We recall a classical result in ergodic theory.

Theorem (Wiener-Wintner ergodic theorem [WW]). Let $(X, \mu, T)$ be an ergodic
system and $\phi \in L^\infty(\mu)$. Then there exists $X_0 \subset X$ with
$\mu(X_0) = 1$ such that

$$\frac{1}{N} \sum_{n=0}^{N-1} \phi(T^n x)\, e(nt)$$

converges for every $x \in X_0$ and every $t \in \mathbb{T}$.

The important point here is that the set $X_0$ does not depend on the choice of
$t$. We also recall an immediate corollary of the spectral theorem:

Corollary (of the spectral theorem). Let $\mathbf{a} = (a_n : n \in
\mathbb{Z})$ be a bounded sequence and assume that

$$\lim_{N\to+\infty} \frac{1}{N} \sum_{n=0}^{N-1} a_n e(nt)$$

exists for every $t \in \mathbb{T}$. Then for every system $(Y, \nu, S)$ and
every $f \in L^2(\nu)$, the averages

$$\frac{1}{N} \sum_{n=0}^{N-1} a_n S^n f$$

converge in $L^2(\nu)$ as $N \to +\infty$.

Putting these two results together, we have: 

Corollary. Assume that $(X, \mu, T)$ is an ergodic system and
$\phi \in L^\infty(\mu)$. There exists $X_0 \subset X$ with $\mu(X_0) = 1$ such
that for every $x \in X_0$, every system $(Y, \nu, S)$, and every
$f \in L^2(\nu)$, the averages

$$\frac{1}{N} \sum_{n=0}^{N-1} \phi(T^n x)\, S^n f$$

converge in $L^2(\nu)$ as $N \to +\infty$.

The strength of this result is that the set $X_0$ does not depend either on $Y$
or on $f$. We say that for every $x \in X_0$, the sequence $(\phi(T^n x))$ is
universally good for the convergence in mean of ergodic averages. In fact, for
almost every $x$, this sequence is also universally good for the almost
everywhere convergence [BFKO], but we do not address this strengthening here.

2.6.2. We generalize these results for multiple ergodic averages. We start with
a generalization of the Wiener-Wintner Theorem, where we can replace the
exponential sequence $e(nt)$ by an arbitrary nilsequence.

Theorem 2.22 (A generalized Wiener-Wintner Theorem). Let $(X, \mu, T)$ be an
ergodic system and $\phi$ be a bounded measurable function on $X$. Then there
exists $X_0 \subset X$ with $\mu(X_0) = 1$ such that the averages

$$\frac{1}{N} \sum_{n=0}^{N-1} \phi(T^n x)\, b_n$$

converge as $N \to +\infty$ for every $x \in X_0$ and every nilsequence
$\mathbf{b} = (b_n : n \in \mathbb{Z})$.
We give a sample application. Generalized polynomials were defined in
Section 2.5.

Corollary 2.23. Let $(X, \mu, T)$ be an ergodic system, $\phi$ be a bounded
measurable function on $X$, and $X_0$ be the subset of $X$ introduced in
Theorem 2.22. Then for every $x \in X_0$ and every generalized polynomial $p$,
the averages

$$\frac{1}{N} \sum_{n=0}^{N-1} \phi(T^n x)\, \{p(n)\} \quad\text{and}\quad
\frac{1}{N} \sum_{n=0}^{N-1} \phi(T^n x)\, e(p(n))$$

converge.

(Recall that $\{p(n)\}$ denotes the fractional part of $p(n)$.) For standard
polynomial sequences, this result was proven by Lesigne [Les2].

We next have a version of the spectral result for higher order nilsequences: 

Theorem 2.24 (A substitute for the corollary of the Spectral Theorem). Let
$k \ge 1$ be an integer and $\mathbf{a} = (a_n : n \in \mathbb{Z})$ be a
bounded sequence such that the averages

$$\frac{1}{N} \sum_{n=0}^{N-1} a_n b_n$$

converge as $N \to +\infty$ for every $k$-step nilsequence
$\mathbf{b} = (b_n : n \in \mathbb{Z})$. Then for every system $(Y, \nu, S)$
and every $f_1, \dots, f_k \in L^\infty(\nu)$, the averages

$$\frac{1}{N} \sum_{n=0}^{N-1} a_n\, S^n f_1 \cdot S^{2n} f_2 \cdots
S^{kn} f_k \tag{1}$$

converge in $L^2(\nu)$.

Combining these theorems, we immediately deduce: 

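Theorem 2.25. Let $(X, \mu, T)$ be an ergodic system and
$\phi \in L^\infty(\mu)$. Then there exists $X_0 \subset X$ with
$\mu(X_0) = 1$ such that for every $x \in X_0$, every system $(Y, \nu, S)$,
every integer $k \ge 1$, and all functions
$f_1, \dots, f_k \in L^\infty(\nu)$, the averages

$$\frac{1}{N} \sum_{n=0}^{N-1} \phi(T^n x)\, S^n f_1 \cdot S^{2n} f_2 \cdots
S^{kn} f_k$$

converge in $L^2(\nu)$ as $N \to +\infty$.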

In short, for every $x \in X_0$, the sequence $(\phi(T^n x))$ is universally
good for the convergence in mean of multiple ergodic averages.

While Theorems 2.22 and 2.24 are results about nilsequences, nilsequences do
not appear in the statement of Theorem 2.25; they occur only as tools in the
proof, playing the role of complex exponentials in the classical results.

By successively using Theorems 2.19 and 2.24 we obtain further examples of
universally good sequences for the convergence in mean of multiple ergodic
averages. For example, by Proposition 2.20, for every generalized polynomial
$p$ the sequence $(\{p(n)\} : n \in \mathbb{Z})$ is a universally good sequence
for the convergence in mean of multiple ergodic averages, as is the sequence
$(e(p(n)) : n \in \mathbb{Z})$. By Proposition 2.21, so is the Thue-Morse
sequence.
2.7. Some notation for averages. In this paper we repeatedly take limits of
averages on sequences of intervals. Writing the cumbersome formulas or
replacing them by long explanations would make the paper unreadable and so we
introduce some short notation. However, we continue using explicit formulas in
the main statements.

We have several different notions of averaging for a sequence in
$\ell^\infty(\mathbb{Z})$: over a particular sequence of intervals or uniformly
over all intervals. For averaging over a particular sequence of intervals, we
define:

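Definition 2.26. Let $\mathbf{a} = (a_n : n \in \mathbb{Z})$ be a bounded
sequence and let $\mathbf{I} = (I_j : j \ge 1)$ be a sequence of intervals
whose lengths $|I_j|$ tend to infinity. Define

$$\limsup \bigl| \mathrm{averages}_{\mathbf{I}}(a_n) \bigr| =
\limsup_{j\to+\infty} \Bigl| \frac{1}{|I_j|} \sum_{n\in I_j} a_n \Bigr| .$$

The averages of the sequence $\mathbf{a}$ on $\mathbf{I}$ converge if the limit

$$\lim_{j\to+\infty} \frac{1}{|I_j|} \sum_{n\in I_j} a_n$$

exists. We denote this limit by $\lim \mathrm{averages}_{\mathbf{I}}(a_n)$ and
call this the average over $\mathbf{I}$ of the sequence $\mathbf{a}$.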

For taking a uniform average, we define: 

Definition 2.27. Let $\mathbf{a} = (a_n : n \in \mathbb{Z})$ be a bounded
sequence. The upper limit of the averages of the sequence $\mathbf{a}$ is
defined to be

$$\limsup \bigl| \mathrm{averages}(a_n) \bigr| = \lim_{N\to+\infty}
\sup_{M\in\mathbb{Z}} \Bigl| \frac{1}{N} \sum_{n=M}^{M+N-1} a_n \Bigr| .$$

(Note that this limit exists by subadditivity.)

The averages of the sequence $\mathbf{a}$ converge if the limit
$\lim \mathrm{averages}_{\mathbf{I}}(a_n)$ exists for all sequences of
intervals $\mathbf{I} = (I_j : j \ge 1)$ whose lengths $|I_j|$ tend to
infinity. We denote this (common) limit by $\lim \mathrm{averages}(a_n)$ and
call this the uniform average of the sequence $\mathbf{a}$.



Assuming the existence of the uniform average, it follows that

\[ \lim_{N \to +\infty} \sup_{M \in \mathbb{Z}} \Bigl| \lim \mathrm{averages}(a_n) - \frac{1}{N} \sum_{n=M}^{M+N-1} a_n \Bigr| = 0 . \]
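The uniform average can be explored numerically on a finite range of windows. The following is a minimal Python sketch, not part of the paper: the helper names and the test sequence $a_n = n \bmod 2$ are assumptions chosen for illustration, and only finitely many starting points $M$ are scanned, so the computed quantity is a lower bound for the supremum over all $M \in \mathbb{Z}$.

```python
from fractions import Fraction


def window_average(a, start, length):
    """Exact average of a(n) over the interval [start, start + length - 1]."""
    return sum(Fraction(a(n)) for n in range(start, start + length)) / length


def uniform_deviation(a, limit, length, starts):
    """Largest |limit - window average| over windows of the given length.

    Only the starting points in `starts` are scanned, so this is a lower
    bound for the supremum over all M in Z.
    """
    return max(abs(limit - window_average(a, m, length)) for m in starts)


def parity(n):
    """Illustrative sequence a_n = n mod 2 (hypothetical example)."""
    return n % 2


if __name__ == "__main__":
    half = Fraction(1, 2)
    for length in (7, 8, 101):
        print(length, uniform_deviation(parity, half, length, range(-50, 50)))
```

For this sequence the deviation is $0$ for even window lengths and $1/(2N)$ for odd lengths $N$, consistent with a uniform average equal to $1/2$.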



3. Some tools 
3.1. Nilmanifolds and nilsystems. 

3.1.1. The definitions. Short definitions were given in the introduction and we
repeat them here in a more complete form.

Let $G$ be a group. For $g, h \in G$, we write $[g, h] = g h g^{-1} h^{-1}$ for the commutator
of $g$ and $h$ and we write $[A, B]$ for the subgroup spanned by $\{[a, b] : a \in A, b \in B\}$.
The commutator subgroups $G_j$, $j \geq 1$, are defined inductively by setting $G_1 = G$
and $G_{j+1} = [G_j, G]$. Let $k \geq 1$ be an integer. We say that $G$ is k-step nilpotent if
$G_{k+1}$ is the trivial subgroup.
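To make the definition concrete, here is a small Python sketch using the integer Heisenberg group, a standard example of a 2-step nilpotent group that is not discussed in this section. The triple $(a, b, c)$ stands for the upper unitriangular matrix with entries $a$, $b$ above the diagonal and $c$ in the corner; this encoding and all function names are assumptions made for illustration.

```python
IDENTITY = (0, 0, 0)


def mul(g, h):
    """Product in the Heisenberg group, elements encoded as triples (a, b, c)."""
    a, b, c = g
    a2, b2, c2 = h
    return (a + a2, b + b2, c + c2 + a * b2)


def inv(g):
    """Inverse of (a, b, c)."""
    a, b, c = g
    return (-a, -b, -c + a * b)


def commutator(g, h):
    """[g, h] = g h g^{-1} h^{-1}, the convention used in the text."""
    return mul(mul(g, h), mul(inv(g), inv(h)))


if __name__ == "__main__":
    g, h = (2, 3, 5), (7, 11, 13)
    c = commutator(g, h)
    print(c)                          # an element of G_2 (central)
    print(commutator(c, (4, -1, 9)))  # an element of G_3
```

Every commutator has the form $(0, 0, ab' - a'b)$, which is central, so $G_3$ is trivial while $G_2$ is not.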

Let $G$ be a k-step nilpotent Lie group and $\Gamma$ a discrete cocompact subgroup of
$G$. The compact manifold $X = G/\Gamma$ is called a k-step nilmanifold. The group $G$
acts on $X$ by left translations and we write this action as $(g, x) \mapsto g.x$. The Haar
measure $\mu$ of $X$ is the unique probability measure on $X$ invariant under this action.

Let $\tau \in G$ and $T$ be the transformation $x \mapsto \tau.x$ of $X$. Then $(X, T, \mu)$ is called
a k-step nilsystem. When the measure is not needed for results, we omit it and write
that $(X, T)$ is a k-step nilsystem.

Nilsystems are distal topological dynamical systems. This means that, if $d_X$ is
a distance on $X$ defining its topology, then for every $x, x' \in X$,

\[ \text{if } x \neq x', \text{ then } \inf_{n \in \mathbb{Z}} d_X(T^n x, T^n x') > 0 . \]

Let $f$ be a continuous (respectively, smooth) function on $X$ and $x_0 \in X$. The
sequence $(f(T^n x_0) : n \in \mathbb{Z})$ is called a basic (respectively, smooth) k-step nilsequence.
A k-step nilsequence is a uniform limit of basic k-step nilsequences. Therefore,
smooth k-step nilsequences are dense in the space of all k-step nilsequences under
the uniform norm.

The Cartesian product of two k-step nilsystems is again a k-step nilsystem. It
follows that the space of k-step nilsequences is an algebra under pointwise addition
and multiplication. Moreover, this algebra is invariant under the shift.

As an example, 1-step nilsystems are translations on compact abelian Lie groups
and 1-step nilsequences are exactly almost periodic sequences. For examples of
2-step nilsystems and a detailed study of 2-step nilsequences, see [HK2].
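As a hedged illustration, the affine map $(x, y) \mapsto (x + \alpha, y + x)$ of the 2-torus is a standard example from the literature that can be presented as a 2-step nilsystem; this fact is assumed here and not proved in this section. The Python sketch below only checks, with exact rational arithmetic, the closed form $T^n(x, y) = (x + n\alpha,\ y + nx + \binom{n}{2}\alpha)$, which shows the quadratic phase that distinguishes 2-step from 1-step behaviour. The function names and the rational value of $\alpha$ are illustrative choices (an ergodic example would need $\alpha$ irrational).

```python
from fractions import Fraction


def skew_step(point, alpha):
    """One application of T(x, y) = (x + alpha, y + x) on the 2-torus."""
    x, y = point
    return ((x + alpha) % 1, (y + x) % 1)


def skew_iterate(point, alpha, n):
    """T^n(point) for n >= 0, by direct iteration."""
    for _ in range(n):
        point = skew_step(point, alpha)
    return point


def closed_form(point, alpha, n):
    """T^n(x, y) = (x + n*alpha, y + n*x + n(n-1)/2 * alpha) mod 1."""
    x, y = point
    return ((x + n * alpha) % 1, (y + n * x + Fraction(n * (n - 1), 2) * alpha) % 1)


if __name__ == "__main__":
    start = (Fraction(1, 3), Fraction(2, 5))
    alpha = Fraction(1, 7)
    for n in range(5):
        print(n, skew_iterate(start, alpha, n), closed_form(start, alpha, n))
```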

A general reference on nilsystems is [AGH] and the results summarized in the
next few sections are contained in the literature. See, for example, [Les1] and [Lei].

3.1.2. Ergodicity. 

Theorem 3.1. Let $k \geq 1$ be an integer. For a k-step nilsystem $(X = G/\Gamma, T)$ with
Haar measure $\mu$, the following properties are equivalent:

(i) $(X, T)$ is transitive, meaning that it admits a dense orbit.

(ii) $(X, T)$ is minimal, meaning that every orbit is dense.

(iii) $(X, T)$ is uniquely ergodic.

(iv) $(X, \mu, T)$ is ergodic.

When these properties are satisfied, we say that the system is ergodic, even in
statements of topological nature (that is, without mention of the measure).

Theorem 3.2. Let $k \geq 1$ be an integer, $(X = G/\Gamma, T)$ be a k-step nilsystem where
$T$ is the translation by $\tau \in G$. Let $x_0 \in X$ and let $Y$ be the closed orbit of $x_0$,
meaning that $Y$ is the closure of the orbit $\{T^n x_0 : n \in \mathbb{Z}\}$. Then $(Y, T)$ is a k-step
nilsystem. More precisely, there exists a closed subgroup $G'$ of $G$ containing $\tau$, such
that $\Gamma' = \Gamma \cap G'$ is cocompact in $G'$ and $Y = G'/\Gamma'$.

If $(f(T^n x_0) : n \in \mathbb{Z})$ is a basic (respectively, smooth) nilsequence, by substituting
the closed orbit of $x_0$ for $X$, we deduce:

Corollary 3.3. For every basic (respectively, smooth) k-step nilsequence $\mathbf{a} =
(a_n : n \in \mathbb{Z})$, there exists an ergodic k-step nilsystem $(X, T)$, $x_0 \in X$, and a
continuous (respectively, smooth) function $f$ on $X$ with $a_n = f(T^n x_0)$ for every $n \in \mathbb{Z}$.

Corollary 3.4. Let $\mathbf{a} = (a_n : n \in \mathbb{Z})$ be a nilsequence. Then the averages of $\mathbf{a}$
converge.

Proof. By density, we can restrict to the case that $\mathbf{a}$ is a basic nilsequence, and we
write it as in Corollary 3.3. By unique ergodicity of $(X, T)$, the averages converge
to $\int f \, d\mu$, where $\mu$ is the Haar measure of $X$. □

3.1.3. A criterion for ergodicity.

Theorem 3.5. Let $k \geq 1$ be an integer, $(X = G/\Gamma, T)$ be a k-step nilsystem, and
assume that $T$ is translation by $\tau \in G$. Assume that

(*) The group $G$ is spanned by the connected component $G_0$ of its unit and
by $\tau$.

Then $(X, T)$ is ergodic if and only if the translation induced by $\tau$ on the compact
abelian group $Z = G/G_2\Gamma$ is ergodic.

Conversely, let $(X = G/\Gamma, T)$ be an ergodic nilsystem where $T$ is the translation
by $\tau \in G$. Let $G_1$ be the subgroup spanned by $G_0$ and $\tau$ and set $\Gamma_1 = \Gamma \cap G_1$.
Then $G_1$ is an open subgroup of $G$, $\Gamma_1$ is a discrete cocompact subgroup of $G_1$,
and by ergodicity, the natural projection of $G_1$ to $X$ is onto. We
can therefore identify $X$ with $G_1/\Gamma_1$. Thus we can assume that hypothesis (*) of
Theorem 3.5 is satisfied. Throughout this paper, we implicitly assume that this
hypothesis holds.

3.1.4. The case of several commuting transformations. Let $X = G/\Gamma$ be a nilmanifold
and let $\tau_1, \dots, \tau_\ell$ be commuting elements of $G$. For $1 \leq i \leq \ell$ let $T_i \colon X \to X$
be the translation by $\tau_i$. Then the results of Section 3.1.2 still hold, modulo the
obvious changes. We do not give the modified statements here, with the exception
of Theorem 3.5:

Theorem 3.6. Let $X = G/\Gamma$ be a nilmanifold, $\tau_1, \dots, \tau_\ell$ be commuting elements
of $G$, and for $1 \leq i \leq \ell$ let $T_i \colon X \to X$ be the translation by $\tau_i$. Assume that:

(**) The group $G$ is spanned by the connected component $G_0$ of its unit and
by $\tau_1, \dots, \tau_\ell$.

Then $X$ is ergodic under the action of $T_1, T_2, \dots, T_\ell$ if and only if the action induced
by these transformations on the compact abelian group $Z = G/G_2\Gamma$ is ergodic.

3.2. The measures $\mu^{[k]}$ and HK-seminorms.

In the rest of this section we consider arbitrary ergodic systems and we assume
that $k \geq 1$ is an integer. We review the construction and properties of certain
objects on $X^{2^k}$ defined in [HK1].

3.2.1. Some notation. We introduce some notation to keep track of the $2^k$ copies
of $X$. If $X$ is a set, we write $X^{[k]} = X^{2^k}$ and index these copies of $X$ by $\{0,1\}^k$.
An element of $X^{[k]}$ is written as

\[ \mathbf{x} = (x_\epsilon : \epsilon \in \{0,1\}^k) . \]

We recall that for $\epsilon \in \{0,1\}^k$ and $h \in \mathbb{Z}^k$, we write $|\epsilon| = \epsilon_1 + \dots + \epsilon_k$ and
$\epsilon \cdot h = \epsilon_1 h_1 + \dots + \epsilon_k h_k$.

We write the element with all 0's of $\{0,1\}^k$ as $\mathbf{0} = (0, 0, \dots, 0)$. We often give
the $\mathbf{0}$-th coordinate of a point of $X^{[k]}$ a distinguished role and we write

\[ X^{[k]} = X \times X^{[k]}_* , \quad \text{where } X^{[k]}_* = X^{2^k - 1} . \]

The coordinates of $X^{[k]}_*$ are indexed by the set

\[ \{0,1\}^k_* = \{0,1\}^k \setminus \{\mathbf{0}\} \]

and a point of $X^{[k]}$ is often written

\[ \mathbf{x} = (x_{\mathbf{0}}, \mathbf{x}_*) , \quad \text{where } \mathbf{x}_* = (x_\epsilon : \epsilon \in \{0,1\}^k_*) . \]

When $(X, \mu, T)$ is a measure preserving system, we also have notation for some
transformations that are naturally defined on $X^{[k]}$. Namely, we write $T^{[k]}$ for the
transformation $T \times T \times \dots \times T$, taken $2^k$ times. Moreover, if $i \in \{1, \dots, k\}$, we
define $T_i^{[k]}$ by

\[ (T_i^{[k]} \mathbf{x})_\epsilon = \begin{cases} T x_\epsilon & \text{if } \epsilon_i = 1 \\ x_\epsilon & \text{otherwise.} \end{cases} \]

For convenience, we also write $X^{[0]} = X$ and $T^{[0]} = T$.
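The indexing conventions above can be mirrored directly in code. The following Python sketch is illustrative only: the names `cube`, `weight`, `dot`, and `face_transform` are hypothetical, and a point of $X^{[k]}$ is modeled as a dictionary keyed by $\epsilon \in \{0,1\}^k$.

```python
from itertools import product


def cube(k):
    """All indices epsilon in {0,1}^k, as tuples."""
    return list(product((0, 1), repeat=k))


def weight(eps):
    """|epsilon| = epsilon_1 + ... + epsilon_k."""
    return sum(eps)


def dot(eps, h):
    """epsilon . h = epsilon_1 h_1 + ... + epsilon_k h_k."""
    return sum(e * x for e, x in zip(eps, h))


def face_transform(point, i, T):
    """Apply T to the coordinates x_eps with eps_i = 1 (i is 1-based)."""
    return {eps: (T(x) if eps[i - 1] == 1 else x) for eps, x in point.items()}


if __name__ == "__main__":
    # Model X = Z with T x = x + 1 and start from the diagonal point (0, ..., 0).
    point = {eps: 0 for eps in cube(2)}
    for i, h_i in ((1, 3), (2, 5)):
        for _ in range(h_i):
            point = face_transform(point, i, lambda x: x + 1)
    print(point)  # coordinate eps holds T^{eps . h} x with h = (3, 5)
```

Applying $T_i^{[k]}$ exactly $h_i$ times for each $i$ to a diagonal point $(x, \dots, x)$ produces the point with coordinates $T^{\epsilon \cdot h} x$, which is the configuration appearing in the averages of Section 3.3.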

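3.2.2. Measures and HK-seminorms. Throughout the rest of this section, $(X, \mu, T)$
denotes an ergodic system.

By induction, for every integer $k \geq 0$ we define a measure $\mu^{[k]}$ on $X^{[k]}$ that
is invariant under $T^{[k]}$. We set $\mu^{[0]} = \mu$. For $k \geq 1$, making the natural
identification of $X^{[k]}$ with $X^{[k-1]} \times X^{[k-1]}$, we write $\mathbf{x} = (\mathbf{x}', \mathbf{x}'')$ for a point of $X^{[k]}$,
with $\mathbf{x}', \mathbf{x}'' \in X^{[k-1]}$. Let $\mathcal{I}^{[k-1]}$ denote the invariant $\sigma$-algebra of the system
$(X^{[k-1]}, \mu^{[k-1]}, T^{[k-1]})$. We define $\mu^{[k]}$ to be the relatively independent joining of
$\mu^{[k-1]}$ with itself over $\mathcal{I}^{[k-1]}$, meaning that if $F, G$ are bounded functions on $X^{[k-1]}$,
then

\[ \int F(\mathbf{x}') G(\mathbf{x}'') \, d\mu^{[k]}(\mathbf{x}) = \int E(F \mid \mathcal{I}^{[k-1]})(\mathbf{y}) \cdot E(G \mid \mathcal{I}^{[k-1]})(\mathbf{y}) \, d\mu^{[k-1]}(\mathbf{y}) . \]

By induction, all the marginals of $\mu^{[k]}$ (that is, the images of this measure under
the natural projections $X^{[k]} \to X$) are equal to $\mu$.

Since $(X^{[0]}, \mu^{[0]}, T^{[0]}) = (X, \mu, T)$ is ergodic, $\mathcal{I}^{[0]}$ is the trivial $\sigma$-algebra and
$\mu^{[1]} = \mu \times \mu$. But for $k \geq 2$ the system $(X^{[k-1]}, \mu^{[k-1]}, T^{[k-1]})$ is not necessarily
ergodic and $\mu^{[k]}$ is not in general the product measure.

For $k \geq 1$ and every $f \in L^\infty(\mu)$,

\[ \int \prod_{\epsilon \in \{0,1\}^k} C^{|\epsilon|} f(x_\epsilon) \, d\mu^{[k]}(\mathbf{x}) = \int \Bigl| E\Bigl( \prod_{\epsilon \in \{0,1\}^{k-1}} C^{|\epsilon|} f(x_\epsilon) \Bigm| \mathcal{I}^{[k-1]} \Bigr) \Bigr|^2 d\mu^{[k-1]} \geq 0 \]

and so we can define the HK-seminorm

\[ |||f|||_k = \Bigl( \int \prod_{\epsilon \in \{0,1\}^k} C^{|\epsilon|} f(x_\epsilon) \, d\mu^{[k]}(\mathbf{x}) \Bigr)^{1/2^k} . \]

To avoid ambiguities when several measures are present, we sometimes write $|||f|||_{\mu,k}$
instead of $|||f|||_k$.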

In [HK1], we show that $|||\cdot|||_k$ is a seminorm on $L^\infty(\mu)$. These seminorms satisfy
an inequality similar to the Cauchy-Schwarz-Gowers inequality for Gowers norms.
Namely, let $f_\epsilon$, $\epsilon \in \{0,1\}^k$, be $2^k$ bounded functions on $X$. Then

\[ \Bigl| \int \prod_{\epsilon \in \{0,1\}^k} f_\epsilon(x_\epsilon) \, d\mu^{[k]}(\mathbf{x}) \Bigr| \leq \prod_{\epsilon \in \{0,1\}^k} |||f_\epsilon|||_k . \tag{2} \]

We also have that consecutive HK-seminorms satisfy $|||f|||_{k+1} \geq |||f|||_k$, and by an
application of the ergodic theorem,

\[ |||f|||_{k+1}^{2^{k+1}} = \lim_{H \to +\infty} \frac{1}{H} \sum_{h=0}^{H-1} |||T^h f \cdot \bar{f}|||_k^{2^k} . \tag{3} \]

Using the definition and the fact that the marginals of $\mu^{[k]}$ are equal to $\mu$, we
have that for all $f \in L^{2^k}(\mu)$,

\[ |||f|||_k \leq \|f\|_{L^{2^k}(\mu)} . \tag{4} \]

In fact, the definition of the seminorm $|||\cdot|||_k$ can be extended to $L^{2^k}(\mu)$ with the
same properties.
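As a sanity check on the recursion (3), here is a minimal Python sketch on the finite system $X = \mathbb{Z}/N\mathbb{Z}$ with $Tx = x + 1$ and uniform measure, which is ergodic. Because every orbit is periodic, the Cesàro limit in (3) reduces to an average over $h \in \{0, \dots, N-1\}$, and the base case is $|||f|||_1^2 = |\int f \, d\mu|^2$ since $\mu^{[1]} = \mu \times \mu$. The function names are illustrative and not taken from the paper.

```python
import cmath


def shift(f, h):
    """T^h f, that is x -> f(x + h), for f given as a list indexed by Z/NZ."""
    n = len(f)
    return [f[(x + h) % n] for x in range(n)]


def seminorm_power(f, k):
    """|||f|||_k^(2^k) on Z/NZ with T x = x + 1, via the recursion (3)."""
    n = len(f)
    if k == 1:
        return abs(sum(f) / n) ** 2
    total = 0.0
    for h in range(n):
        g = [a * b.conjugate() for a, b in zip(shift(f, h), f)]
        total += seminorm_power(g, k - 1)
    return total / n


def seminorm(f, k):
    """|||f|||_k, obtained by taking the 2^k-th root."""
    return seminorm_power(f, k) ** (1.0 / 2 ** k)


if __name__ == "__main__":
    n = 7
    quadratic = [cmath.exp(2j * cmath.pi * ((x * x) % n) / n) for x in range(n)]
    for k in (1, 2, 3):
        print(k, seminorm(quadratic, k))
```

For the quadratic phase $x \mapsto e(x^2/7)$ the values increase with $k$, in line with $|||f|||_{k+1} \geq |||f|||_k$, and the third seminorm equals $1$ up to rounding.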

3.3. Convergence results. 

3.3.1. Averaging along parallelepipeds. These seminorms and a geometric description
of the factors they define are used to show:

Theorem 3.7 ([HK1], Theorem 13.1). Let $f_\epsilon$, $\epsilon \in \{0,1\}^k_*$, be $2^k - 1$ functions in
$L^\infty(\mu)$. Then the averages

\[ \frac{1}{H^k} \sum_{h_1, \dots, h_k = 0}^{H-1} \prod_{\epsilon \in \{0,1\}^k_*} T^{\epsilon \cdot h} f_\epsilon \]

converge in $L^2(\mu)$ and the limit $g$ of these averages is characterized by

\[ \int h \, g \, d\mu = \int h(x_{\mathbf{0}}) \prod_{\epsilon \in \{0,1\}^k_*} f_\epsilon(x_\epsilon) \, d\mu^{[k]}(\mathbf{x}) \]

for every $h \in L^\infty(\mu)$.

In fact, we could replace the averages on $[0, H-1]^k$ by averages over any Følner
sequence in $\mathbb{Z}^k$. Applying Theorem 3.7 to the case that $f_\epsilon = C^{|\epsilon|} f$ for every $\epsilon$, we
obtain:

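Corollary 3.8. For every $f \in L^\infty(\mu)$, the averages

\[ \frac{1}{H^k} \sum_{h_1, \dots, h_k = 0}^{H-1} \prod_{\epsilon \in \{0,1\}^k_*} C^{|\epsilon|} T^{\epsilon \cdot h} f \tag{5} \]

converge in $L^2(\mu)$ as $H \to +\infty$.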

This leads us to a definition:

Definition 3.9. We denote the limit of (5) by $\mathcal{D}_k f$ and call this function the dual
function of $f$.

It follows that the dual function $\mathcal{D}_k f$ satisfies:

\[ \int \mathcal{D}_k f \cdot h \, d\mu = \int h(x_{\mathbf{0}}) \prod_{\epsilon \in \{0,1\}^k_*} C^{|\epsilon|} f(x_\epsilon) \, d\mu^{[k]}(\mathbf{x}) \tag{6} \]

for every $h \in L^\infty(\mu)$. In particular, we have

\[ |||f|||_k^{2^k} = \int \mathcal{D}_k f \cdot f \, d\mu = \lim_{H \to +\infty} \frac{1}{H^k} \sum_{h_1, \dots, h_k = 0}^{H-1} \int \prod_{\epsilon \in \{0,1\}^k} C^{|\epsilon|} T^{\epsilon \cdot h} f \, d\mu . \tag{7} \]

The notion of a dual function is implicit in [HK1] and this notation is not used
there. However, the notation is coherent with that used in several papers of Green
and Tao, where similar functions (in the finite setting) are called dual functions.
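For intuition, the dual function can be computed exactly in the finite setting. The Python sketch below takes $k = 2$ on $\mathbb{Z}/N\mathbb{Z}$ with $Tx = x + 1$ and uniform measure, where the averages (5) over a full period reduce to $\mathcal{D}_2 f(x) = \mathbb{E}_{s,t} \, \bar{f}(x+s) \bar{f}(x+t) f(x+s+t)$. It then compares the pairing in (7) with the sum of fourth powers of the Fourier coefficients, a classical identity for the $U^2$ norm that is assumed here rather than taken from this paper. All names are illustrative.

```python
import cmath


def dual_function_2(f):
    """D_2 f on Z/NZ: average of conj f(x+s) * conj f(x+t) * f(x+s+t) over s, t."""
    n = len(f)
    out = []
    for x in range(n):
        acc = 0
        for s in range(n):
            for t in range(n):
                acc += (f[(x + s) % n].conjugate()
                        * f[(x + t) % n].conjugate()
                        * f[(x + s + t) % n])
        out.append(acc / n ** 2)
    return out


def pairing(f, g):
    """Integral of f * g against the uniform measure on Z/NZ."""
    return sum(a * b for a, b in zip(f, g)) / len(f)


def fourier_fourth_moment(f):
    """Sum over r of |fhat(r)|^4, with fhat(r) = E_x f(x) e(-r x / N)."""
    n = len(f)
    total = 0.0
    for r in range(n):
        coeff = sum(f[x] * cmath.exp(-2j * cmath.pi * r * x / n) for x in range(n)) / n
        total += abs(coeff) ** 4
    return total


if __name__ == "__main__":
    f = [1, 2, 0, -1, 1j]
    print(pairing(dual_function_2(f), f), fourier_fourth_moment(f))
```

Note that with the conjugation pattern $C^{|\epsilon|}$ used in the text, the dual function of a character $e(x/N)$ is its complex conjugate, so the pairing in (7) involves $f$ itself and not $\bar{f}$.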

The definition extends to functions in $L^{2^k}(\mu)$, for which we use the same notation.
Indeed, by (2), (4), and density, for $f \in L^{2^k}(\mu)$ the convergence (5) holds in
$L^{2^k/(2^k-1)}(\mu)$, the limit function $\mathcal{D}_k f$ belongs to $L^{2^k/(2^k-1)}(\mu)$ with

\[ \|\mathcal{D}_k f\|_{L^{2^k/(2^k-1)}(\mu)} \leq \|f\|_{L^{2^k}(\mu)}^{2^k - 1} \]

and formula (6) holds for every $h \in L^{2^k}(\mu)$. Moreover, $\mathcal{D}_k$ is a continuous map
from $L^{2^k}(\mu)$ to $L^{2^k/(2^k-1)}(\mu)$.

3.3.2. Application to sequences. Let $f$ be a bounded function on $X$. We consider
the quantities associated to the bounded sequence $(f(T^n x) : n \in \mathbb{Z})$ for a generic
point $x$ of $X$, as in Section 2.1. From the definition of the ergodic seminorms, the
pointwise ergodic theorem, and (7), we immediately deduce:

Corollary 3.10. Let $k \geq 2$ be an integer and let $\mathbf{I}$ be the sequence of intervals
$([0, N-1] : N \geq 1)$. Let $(X, \mu, T)$ be an ergodic system and let $f \in L^\infty(\mu)$. Then
for almost every $x \in X$, the sequence $(f(T^n x) : n \in \mathbb{Z})$ satisfies property $\mathcal{P}(k)$ on $\mathbf{I}$
and

\[ \|(f(T^n x) : n \in \mathbb{Z})\|_{\mathbf{I},k} = |||f|||_k . \tag{8} \]

Corollary 3.11. Let $k \geq 2$ be an integer, let $(X, T)$ be a uniquely ergodic system
with invariant measure $\mu$, and let $f$ be a Riemann integrable function on $X$. Then
for every $x \in X$ and every sequence of intervals $\mathbf{I}$ whose lengths tend to infinity,
the sequence $(f(T^n x) : n \in \mathbb{Z})$ satisfies property $\mathcal{P}(k)$ on $\mathbf{I}$ and equality (8) holds.
In particular, for every $x \in X$,

\[ \|(f(T^n x) : n \in \mathbb{Z})\|_{U(k)} = |||f|||_k . \]

Proof. The hypothesis means that for every $\delta > 0$ there exist two continuous
functions $g, g'$ on $X$ with $g \leq f \leq g'$ and $\int (g' - g) \, d\mu < \delta$. This implies that
for every $h \in \mathbb{Z}^k$ the function in the last integral of formula (7) is also Riemann
integrable. Therefore the ergodic averages of this function converge everywhere to
its integral. □

3.4. The structure Theorem. We use the following version of the Structure
Theorem of [HK1], which is a combination of statements in Lemma 4.3, Definition
4.10 and Theorem 10.1 of that paper.

Theorem (Structure Theorem). Let $(X, \mu, T)$ be an ergodic system. Then for
every $k \geq 2$ there exists a system $(Z_k, \mu_k, T)$ and a factor map $\pi_k \colon X \to Z_k$ with
the following properties:

(i) $(Z_k, \mu_k, T)$ is the inverse limit of a sequence of $(k-1)$-step nilsystems.

(ii) For every function $f \in L^\infty(\mu)$, $|||f - E(f \mid Z_k) \circ \pi_k|||_k = 0$.

Since $|||f|||_{k+1} \geq |||f|||_k$ for every $f \in L^\infty(\mu)$, the factors are nested: $Z_k$ is a
factor of $Z_{k+1}$.

We use this theorem via the following immediate corollary.

Corollary 3.12. Let $(X, \mu, T)$ be an ergodic system and $f \in L^\infty(\mu)$. Then for every
$\delta > 0$, there exists a $(k-1)$-step ergodic nilsystem $(Y, S, \nu)$, a (measure theoretic)
factor map $p \colon X \to Y$, and a continuous function $h$ on $Y$ with $|||f - h \circ p|||_k < \delta$.

4. The correspondence principle and the "seminorms" 

4.1. The classic Correspondence Principle. In translating Szemerédi's Theorem
into a problem in ergodic theory, Furstenberg introduced the Correspondence
Principle in [F]. We give a not completely classical presentation of this principle,
which is amenable to modification in the sequel.

By a separable subalgebra of $\ell^\infty(\mathbb{Z})$, we mean a unital subalgebra of $\ell^\infty(\mathbb{Z})$,
invariant under the shift and under complex conjugation, closed in $\ell^\infty(\mathbb{Z})$ and
separable for the uniform norm, written $\|\cdot\|_\infty$. In the sequel, we mostly consider
the case of the separable subalgebra $\mathcal{A}(\mathbf{a})$ spanned by a bounded sequence $\mathbf{a} =
(a_n : n \in \mathbb{Z})$.

We write $\sigma$ for the shift on $\ell^\infty(\mathbb{Z})$, and thus for a sequence $\mathbf{a} = (a_n : n \in \mathbb{Z})$, $\sigma\mathbf{a}$
denotes the sequence $(a_{n+1} : n \in \mathbb{Z})$. We use $\bar{\mathbf{a}}$ to denote the conjugate sequence
$(\bar{a}_n : n \in \mathbb{Z})$. In the sequel, $\mathcal{A}$ denotes a separable subalgebra of $\ell^\infty(\mathbb{Z})$.

4.1.1. The pointed dynamical system associated to an algebra. Let $X$ be the Gelfand
spectrum of $\mathcal{A}$, meaning that $X$ is the set of unital homomorphisms from $\mathcal{A}$ to
the complex numbers. Letting $C(X)$ denote the algebra of continuous functions on
$X$, there exists an isometric isomorphism of algebras $\Phi \colon C(X) \to \mathcal{A}$.
For $\mathbf{b} \in \mathcal{A}$, the function $\Phi^{-1}(\mathbf{b})$ is called the function associated to $\mathbf{b}$.

Since $\mathcal{A}$ is separable, $X$ is a compact metric space. We write $d_X$ for a distance
on $X$ defining its topology.

The map $\mathbf{b} \mapsto b_0$ is a character of the algebra $\mathcal{A}$. Thus there exists a point $x_0 \in X$
with $f(x_0) = \Phi(f)_0$ for all $f \in C(X)$. The shift on $\mathcal{A}$ induces a homeomorphism
$T \colon X \to X$ with $\Phi(f \circ T) = \sigma\Phi(f)$ for all $f \in C(X)$. Therefore, for every
$f \in C(X)$, $\Phi(f)$ is the sequence

\[ \Phi(f) = (f(T^n x_0) : n \in \mathbb{Z}) . \]

In particular, if $f \in C(X)$ satisfies $f(T^n x_0) = 0$ for all $n \in \mathbb{Z}$, then the sequence $\mathbf{b}$
given by $\Phi^{-1}(\mathbf{b}) = f$ is identically zero and so $f$ itself is identically zero. It follows
that the point $x_0$ of $X$ is transitive, meaning that its orbit $\{T^n x_0 : n \in \mathbb{Z}\}$ is dense
in $X$ (otherwise some nonzero continuous function would vanish on the closure of the orbit).

We encapsulate this construction in the following definition:

Definition 4.1. The triple $(X, T, x_0)$ is called the pointed topological dynamical
system associated to the algebra $\mathcal{A}$.

4.1.2. Averaging schemes and invariant measures. We first introduce a definition
that allows us to average any sequence in a subalgebra over a sequence of intervals:

Definition 4.2. Let $\mathcal{A}$ be a separable subalgebra of $\ell^\infty(\mathbb{Z})$ and $\mathbf{I} = (I_j : j \geq 1)$ be
a sequence of intervals whose lengths tend to infinity. We say that $\mathbf{I}$ is an averaging
scheme for $\mathcal{A}$ if the limit

\[ \lim \mathrm{averages}_{\mathbf{I}}(\mathbf{b}) := \lim_{j \to +\infty} \frac{1}{|I_j|} \sum_{n \in I_j} b_n \]

exists for all $\mathbf{b} \in \mathcal{A}$.

Since $\mathcal{A}$ is separable with respect to the norm of $\ell^\infty(\mathbb{Z})$, for every sequence of
intervals whose lengths tend to infinity, we can always pass to a subsequence that is
an averaging scheme for $\mathcal{A}$. The classical case is when $\mathbf{I}$ is taken to be the sequence
$([0, j-1] : j \geq 1)$, or some subsequence of this sequence.
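The need to pass to a subsequence can be seen on a simple example. The Python sketch below uses a hypothetical bounded sequence that equals $1$ on the dyadic blocks $[2^m - 1, 2^{m+1} - 2]$ with $m$ even and $0$ elsewhere; this sequence is an illustrative assumption, not an example from the paper. Its averages over $[0, 2^M - 2]$ do not converge as $M \to \infty$, but they do along even $M$ and along odd $M$ separately.

```python
from fractions import Fraction


def block_sequence(n):
    """1 on blocks [2^m - 1, 2^(m+1) - 2] with m even, 0 otherwise (0 for n < 0)."""
    if n < 0:
        return 0
    m = (n + 1).bit_length() - 1
    return 1 if m % 2 == 0 else 0


def average(seq, interval):
    """Exact average of seq over the closed interval (lo, hi)."""
    lo, hi = interval
    return Fraction(sum(seq(n) for n in range(lo, hi + 1)), hi - lo + 1)


if __name__ == "__main__":
    for big_m in range(2, 12):
        print(big_m, average(block_sequence, (0, 2 ** big_m - 2)))
```

Along even $M$ the averages are exactly $1/3$, while along odd $M$ they tend to $2/3$, so either subsequence of intervals gives convergent averages for this particular sequence.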

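Given an averaging scheme $\mathbf{I}$ for $\mathcal{A}$, we can associate an invariant probability
measure $\mu$ on $X$ defined by:

\[ \int f \, d\mu = \lim \mathrm{averages}_{\mathbf{I}}(f(T^n x_0)) := \lim_{j \to +\infty} \frac{1}{|I_j|} \sum_{n \in I_j} f(T^n x_0) \tag{9} \]

for all $f \in C(X)$.

We claim that all ergodic invariant probability measures on $X$ are obtained by
this procedure. Namely, let $\mu$ be such a measure. Let $x_1 \in X$ be a generic point
for $\mu$, meaning that for all $f \in C(X)$,

\[ \lim_{j \to +\infty} \frac{1}{j} \sum_{n=0}^{j-1} f(T^n x_1) = \int f \, d\mu . \]

(By the ergodic theorem, $\mu$-almost every point $x_1 \in X$ is generic.) Since $x_0$ is a
transitive point, there exists a sequence $(k_j : j \geq 1)$ of integers such that

\[ \sup_{0 \leq n < j} d_X(T^{k_j + n} x_0, T^n x_1) \to 0 \quad \text{as } j \to +\infty . \]

So for any continuous function $f$ on $X$, we then have

\[ \lim_{j \to +\infty} \frac{1}{j} \sum_{n=0}^{j-1} f(T^{k_j + n} x_0) = \int f \, d\mu . \]

Let $\mathbf{I}$ be the sequence of intervals $(I_j = [k_j, k_j + j - 1] : j \geq 1)$. If $\mathbf{b} \in \mathcal{A}$ and $f$ is
the associated function on $X$, we have

\[ \lim \mathrm{averages}_{\mathbf{I}}(\mathbf{b}) = \lim \mathrm{averages}_{\mathbf{I}}(f(T^n x_0)) = \int f \, d\mu . \]

Therefore the sequence of intervals $\mathbf{I}$ is an averaging scheme for $\mathcal{A}$ corresponding
to the measure $\mu$, and the claim follows.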

4.2. Proofs of properties of the "seminorms". We use this presentation of the
Correspondence Principle to derive the properties of the "seminorms." We start
with the non-negativity that makes the definition possible. Recall that the bounded
sequence $\mathbf{a} = (a_n : n \in \mathbb{Z})$ satisfies property $\mathcal{P}(k)$ on the sequence of intervals $\mathbf{I}$ if
for all $h = (h_1, \dots, h_k) \in \mathbb{Z}^k$, the limit

\[ c_h(\mathbf{I}, \mathbf{a}) = \lim \mathrm{averages}_{\mathbf{I}} \Bigl( \prod_{\epsilon \in \{0,1\}^k} C^{|\epsilon|} a_{n + \epsilon \cdot h} \Bigr) \]

exists. We show that for a sequence $\mathbf{a}$ satisfying this, the limit

\[ \lim_{H \to +\infty} \frac{1}{H^k} \sum_{h_1, \dots, h_k = 0}^{H-1} c_h(\mathbf{I}, \mathbf{a}) \]

exists and is non-negative:



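Proof of Proposition 2.2. Let $k \geq 2$ be an integer and $\mathbf{a} = (a_n : n \in \mathbb{Z})$ be a
bounded sequence that satisfies property $\mathcal{P}(k)$ on a sequence of intervals $\mathbf{I}$. Let
$\mathcal{A} = \mathcal{A}(\mathbf{a})$, $(X, T, x_0)$ be the pointed topological dynamical system associated to
the algebra $\mathcal{A}$, and $f \in C(X)$ be the function associated to the sequence $\mathbf{a}$. Starting
with the sequence of intervals $\mathbf{I}$, by passing to a subsequence $\mathbf{J}$, we extract an
averaging scheme for $\mathcal{A}$. Let $\mu$ be the associated measure on $X$. For every $h \in \mathbb{Z}^k$,
we have

\[ c_h(\mathbf{I}, \mathbf{a}) = c_h(\mathbf{J}, \mathbf{a}) = \int \prod_{\epsilon \in \{0,1\}^k} C^{|\epsilon|} f(T^{\epsilon \cdot h} x) \, d\mu(x) . \tag{10} \]

Let

\[ \mu = \int \mu_\omega \, dP(\omega) \]

be the ergodic decomposition of the measure $\mu$. The integral (10) can be rewritten
as

\[ \int \Bigl( \int \prod_{\epsilon \in \{0,1\}^k} C^{|\epsilon|} f(T^{\epsilon \cdot h} x) \, d\mu_\omega(x) \Bigr) dP(\omega) . \]

By formula (7), applied to each ergodic component $\mu_\omega$,

\[ \lim_{H \to +\infty} \frac{1}{H^k} \sum_{h_1, \dots, h_k = 0}^{H-1} c_h(\mathbf{I}, \mathbf{a}) = \int |||f|||_{\mu_\omega, k}^{2^k} \, dP(\omega) . \]

Therefore, the announced limit exists and is non-negative and we have the
statement. □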
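For a periodic sequence the limits $c_h(\mathbf{I}, \mathbf{a})$ can be computed exactly, since the average over one period equals the limit along any sequence of intervals whose lengths tend to infinity, and $c_h$ is itself periodic in each $h_i$. The Python sketch below uses this observation; the function names and the test sequence $a_n = e(n^2/5)$ are illustrative assumptions.

```python
import cmath
from itertools import product


def correlation(a, period, h):
    """c_h for a sequence a of the given period: average over one period of
    the product over eps in {0,1}^k of C^{|eps|} a(n + eps . h)."""
    k = len(h)
    total = 0
    for n in range(period):
        term = 1
        for eps in product((0, 1), repeat=k):
            value = a(n + sum(e * x for e, x in zip(eps, h)))
            term *= value.conjugate() if sum(eps) % 2 else value
        total += term
    return total / period


def norm_power(a, period, k):
    """Average of c_h over h in [0, period)^k, equal to the limit over H."""
    hs = product(range(period), repeat=k)
    return sum(correlation(a, period, h) for h in hs) / period ** k


def quadratic_phase(n, q=5):
    """Illustrative sequence a_n = e(n^2 / q)."""
    return cmath.exp(2j * cmath.pi * ((n * n) % q) / q)


if __name__ == "__main__":
    print(norm_power(quadratic_phase, 5, 2))
```

For $k = 2$ and this sequence one has $c_h = e(2 h_1 h_2 / 5)$, and the average over $h$ is $1/5$, a non-negative real number as the proposition predicts.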

Maintaining notation used in the proof, we note that:

\[ \|\mathbf{a}\|_{\mathbf{I},k} = \Bigl( \int |||f|||_{\mu_\omega, k}^{2^k} \, dP(\omega) \Bigr)^{1/2^k} . \tag{11} \]

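We now prove the versions of subadditivity that are satisfied by the "seminorms":

Proof of Propositions 2.4 and 2.5. Assume that the bounded sequence $\mathbf{a}$ satisfies
properties $\mathcal{P}(k)$ and $\mathcal{P}(k+1)$ on the sequence of intervals $\mathbf{I}$. By (11), the
Cauchy-Schwarz inequality, and equality (3), we have

\[ \|\mathbf{a}\|_{\mathbf{I},k}^{2^{k+1}} \leq \int |||f|||_{\mu_\omega, k}^{2^{k+1}} \, dP(\omega) \leq \int |||f|||_{\mu_\omega, k+1}^{2^{k+1}} \, dP(\omega) = \|\mathbf{a}\|_{\mathbf{I},k+1}^{2^{k+1}} . \]

Thus $\|\mathbf{a}\|_{\mathbf{I},k} \leq \|\mathbf{a}\|_{\mathbf{I},k+1}$ and Proposition 2.5 follows.

Now assume that $\mathbf{a}$ and $\mathbf{b}$ are bounded sequences and assume that the three
sequences $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{a} + \mathbf{b}$ satisfy property $\mathcal{P}(k)$ for some sequence of intervals $\mathbf{I}$.
We proceed as in the proof of Proposition 2.2, taking $\mathcal{A}$ to be the algebra spanned
by $\mathbf{a}$ and $\mathbf{b}$. If $f$ and $g$ are the functions on $X$ associated respectively to $\mathbf{a}$ and $\mathbf{b}$,
we have that

\[ \|\mathbf{a}\|_{\mathbf{I},k}^{2^k} = \int |||f|||_{\mu_\omega, k}^{2^k} \, dP(\omega) ; \quad \|\mathbf{b}\|_{\mathbf{I},k}^{2^k} = \int |||g|||_{\mu_\omega, k}^{2^k} \, dP(\omega) ; \]
\[ \|\mathbf{a} + \mathbf{b}\|_{\mathbf{I},k}^{2^k} = \int |||f + g|||_{\mu_\omega, k}^{2^k} \, dP(\omega) . \]

Therefore, by the subadditivity of each seminorm $|||\cdot|||_{\mu_\omega, k}$ and the Minkowski inequality in $L^{2^k}(P)$,

\[ \|\mathbf{a} + \mathbf{b}\|_{\mathbf{I},k} \leq \|\mathbf{a}\|_{\mathbf{I},k} + \|\mathbf{b}\|_{\mathbf{I},k} \]

and Proposition 2.4 follows. □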

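4.3. A Cauchy-Schwarz-Gowers type result. We have an inequality similar
to that satisfied by the Gowers norms in the finite setting and by the HK-seminorms,
as given in (2):

Proposition 4.3. For every $\epsilon \in \{0,1\}^k$, let $\mathbf{a}(\epsilon) = (a_n(\epsilon) : n \in \mathbb{Z})$ be a bounded
sequence. Let $\mathbf{I}$ be a sequence of intervals whose lengths tend to infinity such that

\[ c_h := \lim_{j \to +\infty} \frac{1}{|I_j|} \sum_{n \in I_j} \prod_{\epsilon \in \{0,1\}^k} a_{n + \epsilon \cdot h}(\epsilon) \]

exists for every $h \in \mathbb{Z}^k$. Then the limit

\[ \lim_{H \to +\infty} \frac{1}{H^k} \sum_{h_1, \dots, h_k = 0}^{H-1} c_h \]

exists.

Moreover, if all the sequences $\mathbf{a}(\epsilon)$ satisfy property $\mathcal{P}(k)$ on $\mathbf{I}$, then

\[ \Bigl| \lim_{H \to +\infty} \frac{1}{H^k} \sum_{h_1, \dots, h_k = 0}^{H-1} c_h \Bigr| \leq \prod_{\epsilon \in \{0,1\}^k} \|\mathbf{a}(\epsilon)\|_{\mathbf{I},k} . \tag{12} \]

Proof. The proof of the convergence is similar to the proof of Proposition 2.2, but we
set $\mathcal{A}$ to be the algebra spanned by the $2^k$ sequences $\mathbf{a}(\epsilon)$, $\epsilon \in \{0,1\}^k$. Maintaining
notation as in that proof, for every $\epsilon \in \{0,1\}^k$ we let $f_\epsilon$ denote the function associated
to the sequence $\mathbf{a}(\epsilon)$. It follows from inequality (2) and the Hölder inequality that

\[ \Bigl| \lim_{H \to +\infty} \frac{1}{H^k} \sum_{h_1, \dots, h_k = 0}^{H-1} c_h \Bigr| = \Bigl| \int \Bigl( \int \prod_{\epsilon \in \{0,1\}^k} f_\epsilon(x_\epsilon) \, d\mu_\omega^{[k]}(\mathbf{x}) \Bigr) dP(\omega) \Bigr| \]
\[ \leq \int \prod_{\epsilon \in \{0,1\}^k} |||f_\epsilon|||_{\mu_\omega, k} \, dP(\omega) \leq \prod_{\epsilon \in \{0,1\}^k} \Bigl( \int |||f_\epsilon|||_{\mu_\omega, k}^{2^k} \, dP(\omega) \Bigr)^{1/2^k} = \prod_{\epsilon \in \{0,1\}^k} \|\mathbf{a}(\epsilon)\|_{\mathbf{I},k} . \]

□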

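Using relations (3) and (11), we deduce that:

Proposition 4.4. Assume that the bounded sequence $\mathbf{a}$ satisfies property $\mathcal{P}(k+1)$
on $\mathbf{I}$. Then

\[ \lim_{H \to +\infty} \frac{1}{H} \sum_{h=0}^{H-1} \|\sigma^h \mathbf{a} . \bar{\mathbf{a}}\|_{\mathbf{I},k}^{2^k} = \|\mathbf{a}\|_{\mathbf{I},k+1}^{2^{k+1}} . \]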

Note that the hypothesis implies that for every integer $h \geq 1$, the sequence $\sigma^h \mathbf{a} . \bar{\mathbf{a}}$
satisfies property $\mathcal{P}(k)$ on $\mathbf{I}$.

4.4. The uniformity seminorms. We also use the Correspondence Principle to
derive properties of the uniformity seminorms:

Proposition 4.5. Let $k \geq 1$ be an integer, $\mathbf{a}$ be a bounded sequence, $(X, T, x_0)$ the
associated pointed dynamical system, and $f \in C(X)$ be the function associated to $\mathbf{a}$.
Then

\[ \|\mathbf{a}\|_{U(k)} = \sup_{\mu \text{ ergodic}} |||f|||_{\mu, k} , \]

where the supremum is taken over all ergodic measures $\mu$ on $X$.

Proof. It follows from (11) that if we raise the left hand side to the power $2^k$,
then it is bounded by the right hand side raised to the power $2^k$. Conversely, in
Section 4.1.2 we showed that every ergodic measure $\mu$ on $X$ is associated to an
averaging scheme $\mathbf{I}$ for the algebra $\mathcal{A}(\mathbf{a})$. By applying (11) again, we have that
$|||f|||_{\mu, k} = \|\mathbf{a}\|_{\mathbf{I},k} \leq \|\mathbf{a}\|_{U(k)}$. □

Proposition 2.7 follows immediately; it could also be derived directly from
Proposition 2.4.

Remark 4.1. We note that there are important differences between the uniformity
seminorms and the HK-seminorms. For example, the formula given by
Proposition 4.4 comes from, and is similar to, the corresponding formula for the HK-seminorms. We
deduce that

\[ \|\mathbf{a}\|_{U(k+1)}^{2^{k+1}} \leq \liminf_{H \to +\infty} \frac{1}{H} \sum_{h=0}^{H-1} \|\sigma^h \mathbf{a} . \bar{\mathbf{a}}\|_{U(k)}^{2^k} . \]

But in general, the liminf on the right hand side of this inequality is not a limit and
equality does not hold.

5. A duality in nilmanifolds and direct results

5.1. Measures and norms for nilsystems. Throughout this section, we assume
that $k \geq 2$ is an integer and $(X = G/\Gamma, \mu, T)$ is an ergodic $(k-1)$-step nilsystem,
where $T$ is the translation by $\tau \in G$. As explained in Section 3, we reduce to the
case that $G$ is spanned by its connected component $G_0$ of the identity and by $\tau$.

We review properties of the measure $\mu^{[k]}$ and of the seminorm $|||\cdot|||_k$ in this
particular case. Most of these properties are established in [HK1] or [GT2], but
often in a very different context and with very different terminology from that used
here. We include some proofs for completeness, but as they are far from the main
topics of the article, we defer them to Appendix B. This appendix also includes
some properties we need that are not stated elsewhere.

We use the notation for $2^k$-Cartesian powers introduced in Section 3. We
summarize the properties that we need:

Theorem 5.1.

(i) The measure $\mu^{[k]}$ is the Haar measure of a sub-nilmanifold $X_k$ of $X^{[k]}$.
The transformations $T^{[k]}$ and $T_i^{[k]}$, $1 \leq i \leq k$, act on $X_k$ by translation
and $X_k$ is ergodic (and thus uniquely ergodic and minimal) under these
transformations.

(ii) Let $X_{k*}$ be the image of $X_k$ under the projection $\mathbf{x} \mapsto \mathbf{x}_*$ from $X^{[k]}$ to
$X^{[k]}_* = X^{2^k - 1}$. There exists a smooth map $\Phi \colon X_{k*} \to X$ such that

\[ X_k = \{ (\Phi(\mathbf{x}_*), \mathbf{x}_*) : \mathbf{x}_* \in X_{k*} \} . \]

(iii) $|||\cdot|||_k$ is a norm on $C(X)$.

(iv) For every $x \in X$, let $W_{k,x} = \{ \mathbf{x} \in X_k : x_{\mathbf{0}} = x \}$. Then $W_{k,x}$ is uniquely
ergodic under the transformations $T_i^{[k]}$, $1 \leq i \leq k$.

(v) For every $x \in X$, let $\rho_x$ be the invariant measure of $W_{k,x}$. Then for
every $x \in X$ and $g \in G$, $\rho_{g.x}$ is the image of $\rho_x$ under the translation by
$g^{[k]} = (g, g, \dots, g)$.

The nilmanifold $X_k$ is defined independently of the transformation $T$ and it only
depends on the structure of the nilmanifold $X$. This implies that the measure $\mu^{[k]}$
and the norm $|||\cdot|||_k$ do not depend on the transformation $T$ on $X$, provided that $T$
is an ergodic transformation. These are geometric, and not dynamical, objects.

5.2. Uniform convergence. Using part (iv) of Theorem 5.1, we deduce:

Corollary 5.2. Let $f_\epsilon$, $\epsilon \in \{0,1\}^k_*$, be $2^k - 1$ continuous functions on $X$. For every
$x \in X$ we have

\[ \frac{1}{H^k} \sum_{h_1, \dots, h_k = 0}^{H-1} \prod_{\epsilon \in \{0,1\}^k_*} f_\epsilon(T^{\epsilon \cdot h} x) \to \int \prod_{\epsilon \in \{0,1\}^k_*} f_\epsilon(x_\epsilon) \, d\rho_x(\mathbf{x}) \]

as $H \to +\infty$. Moreover, the convergence is uniform in $x \in X$.

Proof. The corollary follows easily from part (iv) of Theorem 5.1 by a classical
argument. Let $(x_j : j \geq 1)$ be a sequence in $X$ converging to some $x \in X$ and let
$(H_j : j \geq 1)$ be a sequence of integers tending to infinity.
For every $j$, let $\nu_j$ be the measure

\[ \nu_j = \frac{1}{H_j^k} \sum_{h_1, \dots, h_k = 0}^{H_j - 1} \delta_{(T^{\epsilon \cdot h} x_j \, : \, \epsilon \in \{0,1\}^k)} \]

on $X^{[k]}$ and let $\nu$ be any weak limit of this sequence of measures. For every $j$, the
measure $\nu_j$ is concentrated on $W_{k,x_j}$. Since $X_k$ is closed in $X^{[k]}$, the measure $\nu$
is concentrated on $W_{k,x}$. Moreover, for every $j$ and for $1 \leq i \leq k$, the
measures $\nu_j$ and $T_i^{[k]} \nu_j$ are at a distance $\leq 2/H_j$ in the norm of total
variation. It follows that $\nu$ is invariant under $T_i^{[k]}$ for $i = 1, \dots, k$. By unique
ergodicity of $W_{k,x}$, we have that $\nu$ is the invariant measure $\rho_x$ of $W_{k,x}$.

We have shown that the sequence $(\nu_j : j \geq 1)$ of measures converges weakly to
the measure $\rho_x$. It follows that if $f_\epsilon$, $\epsilon \in \{0,1\}^k_*$, are continuous functions on $X$,
then

\[ \frac{1}{H_j^k} \sum_{h_1, \dots, h_k = 0}^{H_j - 1} \prod_{\epsilon \in \{0,1\}^k_*} f_\epsilon(T^{\epsilon \cdot h} x_j) = \int \prod_{\epsilon \in \{0,1\}^k_*} f_\epsilon(x_\epsilon) \, d\nu_j(\mathbf{x}) \to \int \prod_{\epsilon \in \{0,1\}^k_*} f_\epsilon(x_\epsilon) \, d\rho_x(\mathbf{x}) \]

as $j \to +\infty$ and the result follows. □

We apply this result when $f$ is a continuous function on $X$ and $f_\epsilon = C^{|\epsilon|} f$ for
every $\epsilon \in \{0,1\}^k_*$. From Corollary 3.8, we have that the averages in Corollary 5.2
converge in $L^2(\mu)$ to the function $\mathcal{D}_k f$. Therefore:

Corollary 5.3. Let $f$ be a continuous function on $X$. Then

\[ \mathcal{D}_k f(x) = \int \prod_{\epsilon \in \{0,1\}^k_*} C^{|\epsilon|} f(x_\epsilon) \, d\rho_x(\mathbf{x}) \]

and the function $\mathcal{D}_k f$ is the uniform limit of the sequence

\[ \frac{1}{H^k} \sum_{h_1, \dots, h_k = 0}^{H-1} \prod_{\epsilon \in \{0,1\}^k_*} C^{|\epsilon|} f(T^{\epsilon \cdot h} x) . \]

Thus $\mathcal{D}_k f$ is a continuous function on $X$.

In particular, the function $\mathcal{D}_k f$ is a geometric object: it does not depend on the
transformation $T$ on $X$.

Corollary 5.4. If $f$ is a smooth function on $X$, then $\mathcal{D}_k f$ is a smooth function on
$X$.

Proof. Let $x_0 \in X$. Then, by Corollary 5.3 and part (v) of Theorem 5.1, for every
$g \in G$ we have

\[ \mathcal{D}_k f(g.x_0) = \int \prod_{\epsilon \in \{0,1\}^k_*} C^{|\epsilon|} f(g.x_\epsilon) \, d\rho_{x_0}(\mathbf{x}) . \]

Thus the function $g \mapsto \mathcal{D}_k f(g.x_0)$ is a smooth function on $G$ and the result follows.

□

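Remark 5.1. Let $x \in X$. Since the measure $\rho_x$ is invariant under the
transformations $T_i^{[k]}$, it follows that the image of this measure under the projection $\mathbf{x} \mapsto x_\epsilon$
for every $\epsilon \in \{0,1\}^k_*$ is invariant under $T$ and thus is equal to $\mu$. Therefore if $f_\epsilon$,
$\epsilon \in \{0,1\}^k_*$, are continuous functions on $X$, the Hölder inequality gives:

\[ \Bigl| \int \prod_{\epsilon \in \{0,1\}^k_*} f_\epsilon(x_\epsilon) \, d\rho_x(\mathbf{x}) \Bigr| \leq \prod_{\epsilon \in \{0,1\}^k_*} \|f_\epsilon\|_{L^{2^k - 1}(\mu)} . \]

By density we deduce that for every $f \in L^{2^k - 1}(\mu)$ the function $\mathcal{D}_k f$ is continuous
on $X$ and that

\[ \|\mathcal{D}_k f\|_\infty \leq \|f\|_{L^{2^k - 1}(\mu)}^{2^k - 1} . \]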

5.3. The dual norm.

Definition 5.5. Let the space $C(X)$ of continuous functions on $X$ be endowed with
the norm $|||\cdot|||_k$. Since $|||f|||_k \leq \|f\|_{L^{2^k}(\mu)}$ for every $f \in C(X)$, the dual of this space
can be identified with a subspace of $L^{2^k/(2^k-1)}(\mu)$. We call this space the dual space
and denote it by $C(X)^*_k$. We write $|||h|||^*_k$ for the dual norm of a function $h \in C(X)^*_k$.

In other words, a function $h \in L^{2^k/(2^k-1)}(\mu)$ belongs to the dual space $C(X)^*_k$ if
there exists a constant $C$ with

\[ \Bigl| \int f \, h \, d\mu \Bigr| \leq C \, |||f|||_k \tag{13} \]

for every $f \in C(X)$ and $|||h|||^*_k$ is the smallest constant $C$ with this property.

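We note that the dual space and the dual norm $|||\cdot|||^*_k$ are geometric, not
dynamical, objects.

We give two methods to build functions in the dual space. Let $f$ be a function
on $X$, belonging to $L^{2^k}(\mu)$. By characterization (6) of the dual function and
inequality (2), we have that for every $h \in C(X)$,

\[ \Bigl| \int h . \mathcal{D}_k f \, d\mu \Bigr| \leq |||h|||_k \cdot |||f|||_k^{2^k - 1} . \]

Thus $\mathcal{D}_k f$ belongs to the dual space and $|||\mathcal{D}_k f|||^*_k \leq |||f|||_k^{2^k - 1}$.
On the other hand, when $|||f|||_k \neq 0$,

\[ |||\mathcal{D}_k f|||^*_k \geq \frac{1}{|||f|||_k} \Bigl| \int f . \mathcal{D}_k f \, d\mu \Bigr| = \frac{|||f|||_k^{2^k}}{|||f|||_k} = |||f|||_k^{2^k - 1} \]

and we conclude that

\[ |||\mathcal{D}_k f|||^*_k = |||f|||_k^{2^k - 1} . \tag{14} \]

We now show: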

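Proposition 5.6. The dual space $C(X)^*_k$ contains all smooth functions on $X$.

Proof. Let $f$ be a smooth function on $X$ and let $X_{k*}$ and $\Phi$ be the set and the map
defined in part (ii) of Theorem 5.1.

Then $f \circ \Phi$ is a smooth function on $X_{k*}$ and there exists a smooth function $F$
on $X^{[k]}_*$ whose restriction to $X_{k*}$ is equal to $f \circ \Phi$. This function can be written as

\[ F(\mathbf{x}_*) = \sum_{j=1}^{\infty} \prod_{\epsilon \in \{0,1\}^k_*} f_{j,\epsilon}(x_\epsilon) \]

where the functions $f_{j,\epsilon}$, $j \geq 1$ and $\epsilon \in \{0,1\}^k_*$, are continuous functions on $X$
satisfying

\[ \sum_{j=1}^{\infty} \prod_{\epsilon \in \{0,1\}^k_*} \|f_{j,\epsilon}\|_\infty < +\infty . \]

For every continuous function $h$ on $X$, we have

\[ \Bigl| \int h \, f \, d\mu \Bigr| = \Bigl| \int h(x_{\mathbf{0}}) . F(\mathbf{x}_*) \, d\mu^{[k]}(\mathbf{x}) \Bigr| \leq \sum_{j=1}^{\infty} \Bigl| \int h(x_{\mathbf{0}}) \prod_{\epsilon \in \{0,1\}^k_*} f_{j,\epsilon}(x_\epsilon) \, d\mu^{[k]}(\mathbf{x}) \Bigr| \]
\[ \leq \sum_{j=1}^{\infty} |||h|||_k \cdot \prod_{\epsilon \in \{0,1\}^k_*} |||f_{j,\epsilon}|||_k \leq |||h|||_k \sum_{j=1}^{\infty} \prod_{\epsilon \in \{0,1\}^k_*} \|f_{j,\epsilon}\|_\infty , \]

where the next to last inequality follows from (2). The announced statement follows.

□

A similar proof is used in [GT2] in the finite setting.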

The hypothesis of smoothness is too strong and could be replaced by weaker
assumptions. It is probably sufficient to assume that $f$ is Lipschitz with respect to
some smooth metric on $X$. Computing the dual norm of $f$, or even bounding it in
an explicit way, seems to be difficult. The regularity of the map $\Phi$ should play a
role, but in order to define this, we would first need to choose a metric on $X$.

Proposition 5.7. The unit ball of $C(X)^*_k$ is the closure in $L^{2^k/(2^k-1)}(\mu)$ of the
convex hull of the set

\[ \{ \mathcal{D}_k f : f \in C(X), \ |||f|||_k \leq 1 \} . \]

Proof. Let $B$ be the set in the statement. By (14), for $f \in C(X)$ with $|||f|||_k \leq 1$,
we have that $\mathcal{D}_k f$ belongs to the unit ball of $C(X)^*_k$. Since this ball is closed in the
norm of $L^{2^k/(2^k-1)}(\mu)$, it contains $B$.

On the other hand, let $f$ be a nonzero function belonging to $L^{2^k}(\mu)$ and let
$h = |||f|||_k^{-1} . f$. As the map $\mathcal{D}_k \colon L^{2^k}(\mu) \to L^{2^k/(2^k-1)}(\mu)$ is continuous, by density
we have that $\mathcal{D}_k h \in B$. As $\int f . \mathcal{D}_k h \, d\mu = |||f|||_k$, the Hahn-Banach Theorem gives
the opposite inclusion. □

5.4. Direct theorem (upper bound). We now have assembled the ingredients
to prove Theorem 2.13. As we have not yet defined the norm $|||\mathbf{b}|||^*_k$ of a smooth
nilsequence $\mathbf{b}$, we state this theorem in a modified version.

Theorem (Modified Direct Theorem). Let $\mathbf{a} = (a_n : n \in \mathbb{Z})$ be a bounded sequence
that satisfies property $\mathcal{P}(k)$ on the sequence of intervals $\mathbf{I} = (I_j : j \geq 1)$. Let
$(X, T, \mu)$ be an ergodic $(k-1)$-step nilsystem, $x_0 \in X$, and $f$ be a smooth function
on $X$. Then

\[ \limsup_{j \to +\infty} \Bigl| \frac{1}{|I_j|} \sum_{n \in I_j} a_n f(T^n x_0) \Bigr| \leq \|\mathbf{a}\|_{\mathbf{I},k} \, |||f|||^*_k . \]

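Proof.

5.4.1. We begin with the case that $f = \mathcal{D}_k \phi$ for some continuous function $\phi$ on $X$ with $|||\phi|||_k = 1$.

By substituting a subsequence for $\mathbf{I}$, we can assume that for every $h = (h_1, \dots, h_k) \in \mathbb{Z}^k$, the averages on $I_j$ of

$$a_n \prod_{\epsilon \in \{0,1\}^k_*} C^{|\epsilon|} \phi(T^{n + \epsilon \cdot h} x_0)$$

converge.

Fix $\delta > 0$. By Corollary 5.3, for every sufficiently large $H$ we have that

$$\Bigl| \frac{1}{H^k} \sum_{h_1, \dots, h_k = 0}^{H-1} a_n \prod_{\epsilon \in \{0,1\}^k_*} C^{|\epsilon|} \phi(T^{n + \epsilon \cdot h} x_0) - a_n f(T^n x_0) \Bigr| < \delta$$

for every $n \in \mathbb{Z}$ and so

$$\Bigl| \frac{1}{H^k} \sum_{h_1, \dots, h_k = 0}^{H-1} \frac{1}{|I_j|} \sum_{n \in I_j} a_n \prod_{\epsilon \in \{0,1\}^k_*} C^{|\epsilon|} \phi(T^{n + \epsilon \cdot h} x_0) - \frac{1}{|I_j|} \sum_{n \in I_j} a_n f(T^n x_0) \Bigr| < \delta$$

for every $j \ge 1$. Taking the limit as $j \to +\infty$ along a subsequence, we have that for every sufficiently large $H$,

$$\limsup \bigl| \mathrm{averages}_{\mathbf{I}} \bigl( a_n f(T^n x_0) \bigr) \bigr| \le \delta + \Bigl| \frac{1}{H^k} \sum_{h_1, \dots, h_k = 0}^{H-1} \lim \mathrm{averages}_{\mathbf{I}} \Bigl( a_n \prod_{\epsilon \in \{0,1\}^k_*} C^{|\epsilon|} \phi(T^{n + \epsilon \cdot h} x_0) \Bigr) \Bigr| .$$

We conclude that

$$\limsup \bigl| \mathrm{averages}_{\mathbf{I}} \bigl( a_n f(T^n x_0) \bigr) \bigr| \le \Bigl| \lim_{H \to +\infty} \frac{1}{H^k} \sum_{h_1, \dots, h_k = 0}^{H-1} \lim \mathrm{averages}_{\mathbf{I}} \Bigl( a_n \prod_{\epsilon \in \{0,1\}^k_*} C^{|\epsilon|} \phi(T^{n + \epsilon \cdot h} x_0) \Bigr) \Bigr| .$$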

The existence of the limit for H — > +00 is given by Proposition ^. 31 Using Inequal- 
ity (|12[) and Corollary 13. Ill we have that the last quantity is bounded by 



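$$\|\mathbf{a}\|_{\mathbf{I},k} \cdot \|(\phi(T^n x_0) : n \in \mathbb{Z})\|_{\mathbf{I},k}^{2^k - 1} = \|\mathbf{a}\|_{\mathbf{I},k} \cdot |||\phi|||_k^{2^k - 1} = \|\mathbf{a}\|_{\mathbf{I},k} .$$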



5.4.2. We now turn to the general case. We can assume that $|||f|||^*_k < 1$.

Fix $\delta > 0$. By Proposition 5.6 we can write $f = f_1 + f_2$, where $f_1$ is a convex combination of functions considered in the first part and $\|f_2\|_{L^{2^k/(2^k-1)}(\mu)} < \delta$. The contribution of $f_1$ to the limsup of the averages is bounded by $\|\mathbf{a}\|_{\mathbf{I},k}$.

For every $j \ge 1$, by the Hölder inequality we have

$$\Bigl| \frac{1}{|I_j|} \sum_{n \in I_j} a_n f_2(T^n x_0) \Bigr| \le \|\mathbf{a}\|_\infty \Bigl( \frac{1}{|I_j|} \sum_{n \in I_j} |f_2(T^n x_0)|^{2^k/(2^k-1)} \Bigr)^{(2^k-1)/2^k} .$$

Since both $f$ and $f_1$ are continuous, so is $f_2$. Therefore, by unique ergodicity of $(X, T)$, the averages of $|f_2(T^n x_0)|^{2^k/(2^k-1)}$ converge to the integral of the function $|f_2|^{2^k/(2^k-1)}$ and we have that

$$\limsup \bigl| \mathrm{averages}_{\mathbf{I}} \bigl( a_n f_2(T^n x_0) \bigr) \bigr| \le \|\mathbf{a}\|_\infty \, \delta .$$

The result follows. □
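The Hölder step used for $f_2$ can be sanity-checked numerically. The sketch below is only an illustration under assumptions that are not in the text: the sequences `a` and `b` are random test data standing in for $(a_n)$ and $(f_2(T^n x_0))$, and the function name `holder_bound_holds` is hypothetical.

```python
import random

def holder_bound_holds(a, b, k):
    """Check |avg a_n b_n| <= sup|a_n| * (avg |b_n|^q)^(1/q) with q = 2^k / (2^k - 1)."""
    n = len(a)
    q = 2**k / (2**k - 1)
    lhs = abs(sum(x * y for x, y in zip(a, b)) / n)
    sup_a = max(abs(x) for x in a)
    rhs = sup_a * (sum(abs(y) ** q for y in b) / n) ** (1 / q)
    # Small tolerance for floating point rounding.
    return lhs <= rhs + 1e-12, lhs, rhs

random.seed(0)
# Hypothetical data: a bounded sequence and a "small" sequence.
a = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(200)]
b = [complex(random.gauss(0, 0.1), random.gauss(0, 0.1)) for _ in range(200)]
results = [holder_bound_holds(a, b, k) for k in (2, 3, 4)]
```

Each entry of `results` is a triple (inequality holds, left side, right side); the right side is small whenever the $L^{2^k/(2^k-1)}$ average of `b` is small, which is how the proof uses the bound.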
5.5. The dual norm for smooth nilsequences.

Corollary 5.8. Let $(X, \mu, T)$ and $(Y, \nu, S)$ be ergodic $(k-1)$-step nilsystems, $x_0 \in X$, $y_0 \in Y$, $f$ be a smooth function on $X$, and $g$ a smooth function on $Y$. If $f(T^n x_0) = g(S^n y_0)$ for every $n \in \mathbb{Z}$, then $|||f|||^*_k = |||g|||^*_k$.

Proof. Fix $\delta > 0$. By definition of $|||f|||^*_k$, there exists a continuous function $h$ on $X$ with

$$|||h|||_{\mu,k} = 1 \quad\text{and}\quad \Bigl| \int f h \, d\mu \Bigr| \ge |||f|||^*_k - \delta .$$

By unique ergodicity of $X$,

$$\Bigl| \int f h \, d\mu \Bigr| = \Bigl| \lim_{N \to +\infty} \frac{1}{N} \sum_{n=0}^{N-1} f(T^n x_0) h(T^n x_0) \Bigr| = \Bigl| \lim_{N \to +\infty} \frac{1}{N} \sum_{n=0}^{N-1} g(S^n y_0) h(T^n x_0) \Bigr| .$$

Let $\mathbf{I}$ be the sequence of intervals $(I_N = [0, N-1] : N \ge 1)$. By Corollary 3.11, the sequence $(h(T^n x_0) : n \in \mathbb{Z})$ satisfies property $\mathcal{P}(k)$ on $\mathbf{I}$ and $\|(h(T^n x_0) : n \in \mathbb{Z})\|_{\mathbf{I},k} = |||h|||_{\mu,k} = 1$. By the Modified Direct Theorem, we have that

$$\Bigl| \lim_{N \to +\infty} \frac{1}{N} \sum_{n=0}^{N-1} g(S^n y_0) h(T^n x_0) \Bigr| \le |||g|||^*_k$$

and so $|||f|||^*_k - \delta \le |||g|||^*_k$. Exchanging the roles of $f$ and $g$, we obtain the announced equality. □



Using this corollary, we define:

Definition 5.9. Let $\mathbf{b}$ be a $(k-1)$-step smooth nilsequence. We define $|||\mathbf{b}|||^*_k = |||f|||^*_k$, where $f$ is a smooth function on an ergodic $(k-1)$-step nilsystem $(X, \mu, T)$ and $x_0 \in X$ is chosen such that $b_n = f(T^n x_0)$ for every $n$.

Using this definition, the Direct Theorem (Theorem 2.13) is a reformulation of the Modified Direct Theorem of Section 5.4.



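5.6. The case k = 2. Let $X$ be a 1-step nilmanifold, that is, a compact abelian Lie group, and let $f$ be a smooth function on $X$. Let $\widehat{X}$ be the dual group of $X$. Then the Fourier series of $f$ is

$$f(x) = \sum_{\chi \in \widehat{X}} \widehat{f}(\chi) \, \chi(x) , \quad\text{where}\quad \sum_{\chi \in \widehat{X}} |\widehat{f}(\chi)| < +\infty .$$

An easy computation using the definition gives

$$|||f|||_2 = \Bigl( \sum_{\chi \in \widehat{X}} |\widehat{f}(\chi)|^4 \Bigr)^{1/4} .$$

Therefore, by duality between $\ell^4$ and $\ell^{4/3}$, we have

$$|||f|||^*_2 = \Bigl( \sum_{\chi \in \widehat{X}} |\widehat{f}(\chi)|^{4/3} \Bigr)^{3/4} .$$

If $T$ is an ergodic translation on $X$, $x_0 \in X$, and $\mathbf{b}$ is the sequence given by $b_n = f(T^n x_0)$ for every $n$, we recover the formula for $|||\mathbf{b}|||^*_2$ given in Section 2.3 and Proposition 2.12.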
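The case $k = 2$ can be tested on a finite model. The following sketch is an illustration only and rests on assumptions not made in the text: it replaces the compact abelian Lie group by the finite cyclic group $\mathbb{Z}/N$ with normalized counting measure, uses random test functions, and all function names are hypothetical. It compares the parallelepiped average defining the second seminorm with the $\ell^4$ norm of the Fourier coefficients, and checks the duality bound with the $\ell^{4/3}$ norm.

```python
import cmath
import random

def fourier_coefficients(f):
    """Normalized coefficients on Z/N: hat f(xi) = (1/N) sum_x f(x) e(-x xi / N)."""
    N = len(f)
    return [sum(f[x] * cmath.exp(-2j * cmath.pi * x * xi / N) for x in range(N)) / N
            for xi in range(N)]

def u2_norm_direct(f):
    """Fourth root of the average of f(x) conj f(x+s) conj f(x+t) f(x+s+t)."""
    N = len(f)
    total = 0j
    for x in range(N):
        for s in range(N):
            for t in range(N):
                total += (f[x] * f[(x + s) % N].conjugate()
                          * f[(x + t) % N].conjugate() * f[(x + s + t) % N])
    return (total.real / N**3) ** 0.25

def u2_norm_fourier(f):
    """l^4 norm of the Fourier coefficients."""
    return sum(abs(c) ** 4 for c in fourier_coefficients(f)) ** 0.25

def dual_norm_fourier(f):
    """l^(4/3) norm of the Fourier coefficients."""
    return sum(abs(c) ** (4 / 3) for c in fourier_coefficients(f)) ** 0.75

def mean_pairing(f, g):
    """Average over Z/N of f(x) conj g(x)."""
    return sum(x * y.conjugate() for x, y in zip(f, g)) / len(f)

random.seed(0)
N = 7  # hypothetical small modulus, chosen to keep the triple loop cheap
f = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(N)]
g = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(N)]
```

On this model, `u2_norm_direct(f)` and `u2_norm_fourier(f)` agree up to rounding, and `abs(mean_pairing(f, g))` does not exceed `u2_norm_fourier(g) * dual_norm_fourier(f)`, which is the Hölder inequality on the Fourier side.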

5.7. Some convergence results.

Corollary 5.10. Let $k \ge 2$ be an integer, $\mathbf{I} = (I_j : j \ge 1)$ be a sequence of intervals whose lengths tend to infinity, and let $\mathbf{a} = (a_n : n \in \mathbb{Z})$ be a bounded sequence. Assume that for every $\delta > 0$, there exists a $(k-1)$-step nilsequence $\mathbf{a}'$ such that the sequence $\mathbf{a} - \mathbf{a}'$ satisfies property $\mathcal{P}(k)$ on $\mathbf{I}$ and $\|\mathbf{a} - \mathbf{a}'\|_{\mathbf{I},k} < \delta$. Then for every $(k-1)$-step nilsequence $\mathbf{b} = (b_n : n \in \mathbb{Z})$, the limit

$$\lim_{j \to +\infty} \frac{1}{|I_j|} \sum_{n \in I_j} a_n b_n$$

exists.

Proof. By density, we can restrict to the case that $\mathbf{b}$ is a smooth nilsequence. Let $\delta > 0$ and the nilsequence $\mathbf{a}'$ be as in the statement. Since the product sequence $\mathbf{a}'\mathbf{b}$ is a nilsequence, its averages converge. By Theorem 2.13,

$$\limsup \bigl| \mathrm{averages}_{\mathbf{I}} \bigl( (a_n - a'_n) b_n \bigr) \bigr| \le \delta \, |||\mathbf{b}|||^*_k .$$

It follows that the averages on $I_j$ of $a_n b_n$ form a Cauchy sequence. □

By the same argument, we have:

Corollary 5.11. Let $k \ge 2$ be an integer and $\mathbf{a} = (a_n : n \in \mathbb{Z})$ be a bounded sequence. Assume that for every $\delta > 0$, there exists a $(k-1)$-step nilsequence $\mathbf{a}'$ such that $\|\mathbf{a} - \mathbf{a}'\|_{U(k)} < \delta$. Then for every $(k-1)$-step nilsequence $\mathbf{b} = (b_n : n \in \mathbb{Z})$, the averages of the sequence $a_n b_n$ converge, meaning that the limit

$$\lim_{j \to +\infty} \frac{1}{|I_j|} \sum_{n \in I_j} a_n b_n$$

exists for all sequences of intervals $\mathbf{I} = (I_j : j \ge 1)$ whose lengths tend to infinity.

This Corollary is the direct implication of Theorem 2.19. Propositions 2.20 and 2.21 provide examples of sequences satisfying the hypothesis of this Corollary.

6. The correspondence principle revisited and inverse theorems 

6.1. An extension of the correspondence principle. We recall that a topological dynamical system $(Y, S)$ is distal if for every $y, y' \in Y$ with $y \ne y'$, we have

$$\inf_{n \in \mathbb{Z}} d_Y(S^n y, S^n y') > 0 ,$$

where $d_Y$ denotes a distance defining the topology of $Y$.

Proposition 6.1. Let $(X, T)$ be a topological dynamical system, $x_0 \in X$ a transitive point, and $\mu$ an invariant ergodic measure on $X$. Let $(Y, S)$ be a distal topological dynamical system, $\nu$ an invariant measure on $Y$, and $\pi : (X, \mu, T) \to (Y, \nu, S)$ a measure theoretic factor map.

Then there exist a point $y_0 \in Y$ and a sequence of intervals $\mathbf{I} = (I_j : j \ge 1)$ whose lengths tend to infinity such that for every continuous function $f$ on $X$ and every continuous function $g$ on $Y$,

$$\lim_{j \to +\infty} \frac{1}{|I_j|} \sum_{n \in I_j} f(T^n x_0) \, g(S^n y_0) = \int f \cdot (g \circ \pi) \, d\mu .$$

If the system $(X, T)$ and the point $x_0$ are associated to a sequence as in Section 4.1 and if $Y$ denotes the Kronecker factor of $(X, \mu, T)$, then the sequence of intervals $\mathbf{I}$ given by the Proposition plays the same role as the "Kronecker complete processes" of [BFW]. Our construction is (we hope) simpler and works in a more general setting: below we use it when $Y$ is a nilsystem.

Proof. We write $d_X(\cdot, \cdot)$ and $d_Y(\cdot, \cdot)$ for distances on $X$ and $Y$ defining the topologies of these spaces.

6.1.1. Construction of an extension of X. Let $\mathcal{B}$ be the closed (in norm) subalgebra of $L^\infty(\mu)$ that is spanned by $C(X)$ and the functions $g \circ \pi$ with $g \in C(Y)$. This algebra is unital, separable, and invariant under complex conjugation and under $T$.

Let $W$ be the Gelfand spectrum of this algebra. Since $\mathcal{B}$ is separable, $W$ is a compact metrizable space. By definition, there exists an isometric isomorphism of algebras $\Psi : \mathcal{B} \to C(W)$.

As in Section 4.1.1, there exists a homeomorphism $R : W \to W$ satisfying $\Psi(f \circ T) = \Psi(f) \circ R$ for all functions $f \in \mathcal{B}$.

The inclusion of $C(X)$ in $\mathcal{B}$ induces a continuous surjective map $p : W \to X$ satisfying $f \circ p = \Psi(f)$ for every continuous function $f$ on $X$ and we have that $T \circ p = p \circ R$. Similarly, the map $g \mapsto g \circ \pi$ from $C(Y)$ to $\mathcal{B}$ is an isometric homomorphism of algebras and thus induces a continuous surjective map $q : W \to Y$ satisfying $g \circ q = \Psi(g \circ \pi)$ for all continuous functions $g$ on $Y$. We have that $S \circ q = q \circ R$. So, $p : (W, R) \to (X, T)$ and $q : (W, R) \to (Y, S)$ are factor maps, in the topological sense.







The map $f \mapsto \int f \, d\mu$ is a positive linear form on the algebra $\mathcal{B}$ and thus there exists a unique probability measure $\rho$ on $W$ satisfying

$$\int f \, d\mu = \int \Psi(f) \, d\rho \quad\text{for all functions } f \in \mathcal{B} .$$

Since $\Psi(f \circ T) = \Psi(f) \circ R$ for all $f \in \mathcal{B}$ and $\mu$ is invariant under $T$, the measure $\rho$ is invariant under $R$. Since $\Psi(f) = f \circ p$ for all continuous functions $f$ on $X$, we have that the image of $\rho$ under $p$ is equal to $\mu$. Therefore, $p : (W, \rho, R) \to (X, \mu, T)$ is a measure theoretic factor map. Moreover, for every function $f \in \mathcal{B}$,

$$\int |\Psi(f)|^2 \, d\rho = \int \Psi(|f|^2) \, d\rho = \int |f|^2 \, d\mu$$

and the map $\Psi$ is an isometry from the space $\mathcal{B}$ endowed with the norm $L^2(\mu)$ into the space $L^2(\rho)$. Since $C(X)$ is dense in $\mathcal{B}$ under the $L^2(\mu)$ norm and since $\Psi(f) = f \circ p$ for $f \in C(X)$, we have that for all $f \in \mathcal{B}$,

$$\Psi(f) = f \circ p \quad (\rho\text{-almost everywhere}).$$

We claim that the map $p : (W, \rho, R) \to (X, \mu, T)$ is an isomorphism between measure preserving systems. Indeed, the range of the map $f \mapsto f \circ p : L^2(\mu) \to L^2(\rho)$ is closed in $L^2(\rho)$ because this map is an isometry, and it contains $\Psi(\mathcal{B}) = C(W)$ and thus it is equal to $L^2(\rho)$. In particular, $(W, \rho, R)$ is ergodic.

Finally, for every function $g \in C(Y)$, we have that $g \circ q = \Psi(g \circ \pi) = g \circ \pi \circ p$ ($\rho$-almost everywhere) and so $q = \pi \circ p$ ($\rho$-almost everywhere).

In particular, the image of $\rho$ under $q$ is $\nu$.

6.1.2. Construction of the sequence of intervals. Since $\rho$ is ergodic under $R$, it admits a generic point $w_1$. Recall that this means that for every $f \in C(W)$,

$$\lim_{j \to +\infty} \frac{1}{j} \sum_{n=0}^{j-1} f(R^n w_1) = \int f \, d\rho .$$

Set $x_1 = p(w_1)$. Since $x_0$ is a transitive point of $X$, we can choose as in Section 4.1.2 a sequence of integers $(k_j : j \ge 1)$ such that

$$(15) \qquad \lim_{j \to +\infty} \sup_{0 \le n < j} d_X(T^n x_1, T^{k_j + n} x_0) = 0 .$$

Set $y_1 = q(w_1)$. Let $\eta$ be a point in the closure of the sequence $(S^{k_j} : j \ge 1)$ in the Ellis semigroup of $(Y, S)$. Since $(Y, S)$ is distal, we have (see [A], chapter 5) that $\eta$ is a bijection from $Y$ onto itself. Pick $y_0 \in Y$ such that $\eta(y_0) = y_1$. Thus passing, if necessary, to a subsequence of $(k_j : j \ge 1)$, which we also denote by $(k_j : j \ge 1)$, we have that $S^{k_j} y_0$ converges to $y_1$. Again replacing this sequence by a subsequence, we can assume that

$$(16) \qquad \lim_{j \to +\infty} \sup_{0 \le n < j} d_Y(S^n y_1, S^{k_j + n} y_0) = 0 .$$

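For all $j \ge 1$, set $I_j = [k_j, k_j + j - 1]$. Let $f$ be a continuous function on $X$ and $g$ a continuous function on $Y$. By (15) and (16) we have that

$$\lim_{j \to +\infty} \sup_{0 \le n < j} |f(T^n x_1) - f(T^{k_j + n} x_0)| = 0 \quad\text{and}\quad \lim_{j \to +\infty} \sup_{0 \le n < j} |g(S^n y_1) - g(S^{k_j + n} y_0)| = 0 .$$

Thus

$$(17) \qquad \lim_{j \to +\infty} \Bigl| \frac{1}{|I_j|} \sum_{n \in I_j} f(T^n x_0) g(S^n y_0) - \frac{1}{j} \sum_{n=0}^{j-1} f(T^n x_1) g(S^n y_1) \Bigr| = 0 .$$

For each integer $n$,

$$f(T^n x_1) \, g(S^n y_1) = f \circ p(R^n w_1) \cdot g \circ q(R^n w_1) .$$

Since $w_1$ is a generic point with respect to the measure $\rho$, the second average in (17) converges to

$$\int (f \circ p) \cdot (g \circ q) \, d\rho = \int (f \circ p) \cdot (g \circ \pi \circ p) \, d\rho = \int f \cdot (g \circ \pi) \, d\mu$$

because $q = \pi \circ p$ ($\rho$-almost everywhere) and the image of $\rho$ under $p$ is $\mu$. □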
6.2. Inverse results.

Proposition 6.2. Let $k \ge 2$ be an integer, $\mathbf{a}$ be a bounded sequence, and $\delta > 0$. Then there exists a sequence of intervals $\mathbf{I} = (I_j : j \ge 1)$ whose lengths tend to infinity and a $(k-1)$-step smooth nilsequence $\mathbf{b}$ such that

(i) The sequence $\mathbf{a}$ satisfies property $\mathcal{P}(k)$ on $\mathbf{I}$ and $\|\mathbf{a}\|_{\mathbf{I},k} \ge \|\mathbf{a}\|_{U(k)} - \delta$.

(ii) The sequence $\mathbf{a} - \mathbf{b}$ satisfies property $\mathcal{P}(k)$ on $\mathbf{I}$ and $\|\mathbf{a} - \mathbf{b}\|_{\mathbf{I},k} < \delta$.

Proof. Let $(X, T, x_0)$ be the pointed dynamical system associated to the algebra spanned by the sequence $\mathbf{a}$, as in Section 4.1.1. Let $f$ be the continuous function on $X$ defined by $f(T^n x_0) = a_n$ for every $n \in \mathbb{Z}$.

By Proposition 4.5 there exists an invariant ergodic measure $\mu$ on $X$ with $|||f|||_{\mu,k} \ge \|\mathbf{a}\|_{U(k)} - \delta$. By Corollary 3.12 of the Structure Theorem there exist a $(k-1)$-step nilsystem $(Y, S, \nu)$, a measure theoretic factor map $\pi : (X, \mu, T) \to (Y, \nu, S)$, and a smooth function $g$ on $Y$ with $|||f - g \circ \pi|||_{\mu,k} < \delta$.

Recall that every nilsystem is distal. Now, let $\mathbf{I}$ and $y_0$ be given by Proposition 6.1 and let $\mathbf{b}$ be the nilsequence given by $b_n = g(S^n y_0)$ for every $n \in \mathbb{Z}$.

The measure on $X$ associated to $\mathbf{I}$ as in 4.1.2 is equal to $\mu$. Thus the sequence $\mathbf{a}$ satisfies property $\mathcal{P}(k)$ on $\mathbf{I}$ and $\|\mathbf{a}\|_{\mathbf{I},k} = |||f|||_{\mu,k} \ge \|\mathbf{a}\|_{U(k)} - \delta$. To prove Proposition 6.2 we are left with proving that the sequence $\mathbf{a} - \mathbf{b}$ satisfies property $\mathcal{P}(k)$ on $\mathbf{I}$ and that $\|\mathbf{a} - \mathbf{b}\|_{\mathbf{I},k} < \delta$.

For $h = (h_1, \dots, h_k) \in \mathbb{Z}^k$, we have

$$\prod_{\epsilon \in \{0,1\}^k} C^{|\epsilon|} (a_{n + \epsilon \cdot h} - b_{n + \epsilon \cdot h}) = \prod_{\epsilon \in \{0,1\}^k} C^{|\epsilon|} \bigl( f(T^{n + \epsilon \cdot h} x_0) - g(S^{n + \epsilon \cdot h} y_0) \bigr)$$

$$= \sum_{(A,B) \text{ partition of } \{0,1\}^k} (-1)^{|B|} \prod_{\epsilon \in A} C^{|\epsilon|} f(T^{n + \epsilon \cdot h} x_0) \prod_{\epsilon \in B} C^{|\epsilon|} g(S^{n + \epsilon \cdot h} y_0) .$$

By definition of $\mathbf{I}$, the averages (with respect to $n$) of the above expression on this sequence of intervals converge to

$$\sum_{(A,B) \text{ partition of } \{0,1\}^k} (-1)^{|B|} \int \prod_{\epsilon \in A} C^{|\epsilon|} f(T^{\epsilon \cdot h} x) \prod_{\epsilon \in B} C^{|\epsilon|} g \circ \pi(T^{\epsilon \cdot h} x) \, d\mu(x) = \int \prod_{\epsilon \in \{0,1\}^k} C^{|\epsilon|} (f - g \circ \pi)(T^{\epsilon \cdot h} x) \, d\mu(x) .$$

By definition, the averages (with respect to $h \in \mathbb{Z}^k$) of the first term converge to $\|\mathbf{a} - \mathbf{b}\|_{\mathbf{I},k}^{2^k}$ and, by Corollary 3.8, the averages of the last integral converge to $|||f - g \circ \pi|||_{\mu,k}^{2^k} < \delta^{2^k}$ and we are done. □
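The expansion of the product over partitions $(A, B)$ of the cube, with sign $(-1)^{|B|}$, is a purely algebraic identity and can be checked on arbitrary numbers. The sketch below is an illustration under assumptions not in the text: the lists `u` and `v` are random complex values standing in for the $2^k$ factors involving $f$ and $g$ (conjugations already applied), and the function names are hypothetical.

```python
from itertools import product
import random

def expand_over_partitions(u, v):
    """Sum over all partitions (A, B) of the index set of
    (-1)^{|B|} * prod_{i in A} u[i] * prod_{i in B} v[i]."""
    m = len(u)
    total = 0
    for choice in product((0, 1), repeat=m):  # choice[i] == 1 puts index i in B
        term = 1
        for i, in_b in enumerate(choice):
            term *= -v[i] if in_b else u[i]
        total += term
    return total

def direct_product(u, v):
    """prod_i (u[i] - v[i])."""
    result = 1
    for x, y in zip(u, v):
        result *= x - y
    return result

random.seed(0)
k = 3                      # hypothetical choice; the cube {0,1}^k has 2^k vertices
m = 2**k
u = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(m)]
v = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(m)]
difference = abs(expand_over_partitions(u, v) - direct_product(u, v))
```

The value `difference` is zero up to rounding, which is the identity used to pass from the sum over partitions back to the single product involving $f - g \circ \pi$.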

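We now prove the Inverse Theorem (Theorem 2.16). We recall the statement here for convenience.

Theorem. Let $\mathbf{a} = (a_n : n \in \mathbb{Z})$ be a bounded sequence. Then for every $\delta > 0$, there exists a $(k-1)$-step smooth nilsequence $\mathbf{b} = (b_n : n \in \mathbb{Z})$ such that

$$|||\mathbf{b}|||^*_k = 1 \quad\text{and}\quad \lim_{N \to +\infty} \sup_{M \in \mathbb{Z}} \Bigl| \frac{1}{N} \sum_{n=M}^{M+N-1} a_n b_n \Bigr| \ge \|\mathbf{a}\|_{U(k)} - \delta .$$

Proof. We can assume without loss that $\|\mathbf{a}\|_{U(k)} > \delta$. Let $\mathbf{I}$ and $\mathbf{c}$ be as in Proposition 6.2 but with $\delta/3$ instead of $\delta$; we write $c_n = g(S^n y_0)$ for $n \in \mathbb{Z}$, where $(Y, S, \nu)$ is an ergodic $(k-1)$-step nilsystem, $y_0 \in Y$, and $g$ is a smooth function on $Y$. We define $h = |||g|||_k^{-2^k + 1} \cdot \mathcal{D}_k g$ and $\mathbf{b}$ to be the sequence given by $b_n = h(S^n y_0)$, and we check that the announced properties are satisfied.

By Corollary 5.4, $h$ is a smooth function and $|||h|||^*_k = 1$ by (14) and thus $|||\mathbf{b}|||^*_k = 1$. We have

$$\lim \mathrm{averages}_{\mathbf{I}} (c_n b_n) = \lim \mathrm{averages}_{\mathbf{I}} \bigl( g(S^n y_0) h(S^n y_0) \bigr) = \int g \cdot h \, d\nu = |||g|||_k = \|\mathbf{c}\|_{\mathbf{I},k} \ge \|\mathbf{a}\|_{\mathbf{I},k} - \delta/3 \ge \|\mathbf{a}\|_{U(k)} - 2\delta/3 .$$

On the other hand, by the Direct Theorem 2.13,

$$\limsup \bigl| \mathrm{averages}_{\mathbf{I}} \bigl( (a_n - c_n) b_n \bigr) \bigr| \le \|\mathbf{a} - \mathbf{c}\|_{\mathbf{I},k} \, |||\mathbf{b}|||^*_k < \delta/3$$

and we conclude that the liminf of the averages on $\mathbf{I}$ of $a_n b_n$ is $\ge \|\mathbf{a}\|_{U(k)} - \delta$ and we are done. □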

6.3. Proof of Theorem 2.19. We recall the statement for convenience.

Theorem. For a bounded sequence $\mathbf{a} = (a_n : n \in \mathbb{Z})$, the following are equivalent.

(i) For every $\delta > 0$, the sequence $\mathbf{a}$ can be written as $\mathbf{a}' + \mathbf{a}''$, where $\mathbf{a}'$ is a $(k-1)$-step nilsequence, and $\|\mathbf{a}''\|_{U(k)} < \delta$.

(ii) For every $(k-1)$-step nilsequence $\mathbf{c} = (c_n : n \in \mathbb{Z})$, the averages of $a_n c_n$ converge.

We recall that property (ii) means that the averages

$$\frac{1}{|I_j|} \sum_{n \in I_j} a_n c_n$$

converge for every sequence of intervals $\mathbf{I} = (I_j : j \ge 1)$ whose lengths tend to infinity. The common value of these limits is written $\lim \mathrm{averages}(a_n c_n)$.

Proof. (i) $\Rightarrow$ (ii). This implication is given by Corollary 5.11.

(ii) $\Rightarrow$ (i). Assume that the sequence $\mathbf{a}$ satisfies (ii). Let $\mathbf{b}$ and $\mathbf{I}$ be as in Proposition 6.2 but with $\delta/3$ instead of $\delta$. Define $\mathbf{a}' = \mathbf{b}$ and we are left with showing that $\|\mathbf{a} - \mathbf{b}\|_{U(k)} < \delta$.

Assume that this does not hold. By Theorem 2.16 there exists a $(k-1)$-step smooth nilsequence $\mathbf{c}$ and a sequence of intervals $\mathbf{J}$ whose lengths tend to infinity with

$$|||\mathbf{c}|||^*_k = 1 \quad\text{and}\quad \bigl| \lim \mathrm{averages}_{\mathbf{J}} \bigl( (a_n - b_n) c_n \bigr) \bigr| \ge 2\delta/3 .$$

Now, the sequence $(b_n c_n)$ is a product of two $(k-1)$-step nilsequences and thus it is also a $(k-1)$-step nilsequence and its averages converge. By hypothesis, the averages of the sequence $(a_n c_n)$ converge, and thus the averages of the sequence $(a_n - b_n) c_n$ converge. Since $\mathbf{I}$ and $\mathbf{J}$ are sequences of intervals whose lengths tend to infinity,

$$\bigl| \lim \mathrm{averages}_{\mathbf{I}} \bigl( (a_n - b_n) c_n \bigr) \bigr| = \bigl| \lim \mathrm{averages} \bigl( (a_n - b_n) c_n \bigr) \bigr| = \bigl| \lim \mathrm{averages}_{\mathbf{J}} \bigl( (a_n - b_n) c_n \bigr) \bigr| \ge 2\delta/3 .$$

On the other hand, by the Direct Theorem (Theorem 2.13),

$$\bigl| \lim \mathrm{averages}_{\mathbf{I}} \bigl( (a_n - b_n) c_n \bigr) \bigr| \le \|\mathbf{a} - \mathbf{b}\|_{\mathbf{I},k} \, |||\mathbf{c}|||^*_k < 2\delta/3$$

and we have a contradiction. □

7. An application in ergodic theory

7.1. Proof of Theorem 2.22. We now turn to the generalization of the Wiener-Wintner Ergodic Theorem, replacing the exponential sequence $e(nt)$ by an arbitrary nilsequence. Throughout this Section, for each integer $N \ge 1$, we write $I_N$ for the interval $[0, N-1]$ and we let $\mathbf{I}$ denote the sequence of intervals $(I_N : N \ge 1)$.

Let $(X, \mu, T)$ be an ergodic system, $\phi$ be a bounded measurable function on $X$, and fix an integer $k \ge 2$. We build a subset $X_0$ of full measure of $X$ on which the conclusion of the Theorem holds for every $(k-1)$-step nilsequence $\mathbf{b}$.

For every integer $r \ge 1$, Corollary 3.12 of the Structure Theorem provides a $(k-1)$-step nilsystem $(Z_r, \nu_r, S_r)$, a factor map $\pi_r : X \to Z_r$ and a continuous function $f_r$ on $Z_r$ such that

$$|||\phi - f_r \circ \pi_r|||_k < 1/r .$$

By Corollary 3.10 there exists a subset $E_r$ of $X$ with $\mu(E_r) = 1$ such that for every $x \in E_r$, we have

$$\|(\phi(T^n x) - f_r \circ \pi_r(T^n x) : n \in \mathbb{Z})\|_{\mathbf{I},k} = |||\phi - f_r \circ \pi_r|||_k < r^{-1} .$$

Note that we consider the map $\pi_r$ to be defined everywhere. For $\mu$-almost every $x$, we have that $f_r \circ \pi_r(T^n x) = f_r(S_r^n \pi_r(x))$ for every $n \in \mathbb{Z}$. Therefore, there exists a set $E'_r \subset X$ with $\mu(E'_r) = 1$ such that

$$\|(\phi(T^n x) - f_r(S_r^n \pi_r(x)) : n \in \mathbb{Z})\|_{\mathbf{I},k} = |||\phi - f_r \circ \pi_r|||_k < r^{-1}$$

for every $x \in E'_r$.

Set $X_0 = \bigcap_{r=1}^{\infty} E'_r$. For every $x \in X_0$, the sequence $(\phi(T^n x) : n \in \mathbb{Z})$ satisfies the hypothesis of Corollary 5.10, completing the proof. □

7.1.1. Proof of Corollary 2.23. Let $(X, \mu, T)$ be an ergodic system, $\phi$ be a bounded measurable function on $X$, and let $X_0$ be the subset of $X$ introduced in Theorem 2.22. Let $x \in X_0$ and $p$ be a generalized polynomial.

For every $n \in \mathbb{Z}$, let $\{p(n)\}$ denote the fractional part of $p(n)$. Then $\{p(\cdot)\}$ is a bounded generalized polynomial. In [BL] (Theorem A, (ii)), it is shown that there exist an ergodic nilsystem $(Y, \nu, S)$, a point $y \in Y$, and a Riemann integrable function $f$ on $Y$ with $\{p(n)\} = f(S^n y)$ for every $n \in \mathbb{Z}$.

For every $\delta > 0$, there exists a continuous function $g$ on $Y$ with $\|f - g\|_{L^1(\nu)} < \delta$.

The sequence $(g(S^n y) : n \in \mathbb{Z})$ is a nilsequence and thus by definition of $X_0$, the averages on $\mathbf{I}$ of $\phi(T^n x) g(S^n y)$ converge. On the other hand, since the function $|f - g|$ is Riemann integrable and $(Y, S)$ is uniquely ergodic, we have that

$$\limsup \bigl| \mathrm{averages}_{\mathbf{I}} \bigl( \phi(T^n x) (f(S^n y) - g(S^n y)) \bigr) \bigr| \le \|\phi\|_\infty \lim \mathrm{averages}_{\mathbf{I}} \bigl( |f(S^n y) - g(S^n y)| \bigr) = \|\phi\|_\infty \int |f - g| \, d\nu < \|\phi\|_\infty \, \delta .$$

Therefore the averages on $\mathbf{I}$ of $\phi(T^n x) \{p(n)\} = \phi(T^n x) f(S^n y)$ form a Cauchy sequence.

We remark that for every $n \in \mathbb{Z}$, we have that $e(p(n)) = e(\{p(n)\}) = e(f(S^n y))$ and that the function $e(f(\cdot))$ is Riemann integrable on $Y$. The same proof gives the second claim of the corollary. □

7.2. Examples. Similar methods can be used to show that some explicit sequences satisfy the hypothesis (i) of Theorem 2.19 and thus are universally good for the convergence in norm of multiple ergodic averages.

Proposition 7.1. Let $(X, T)$ be a uniquely ergodic system with invariant measure $\mu$ and let $k \ge 2$ be an integer. Let $(Z_{k-1}, \mu_{k-1}, T)$ be the factor defined in the Structure Theorem (Theorem 3.4) and assume that the factor map $\pi_{k-1} : X \to Z_{k-1}$ is continuous. Let $f$ be a Riemann integrable function on $X$ and let $x \in X$. Then the sequence $(f(T^n x) : n \in \mathbb{Z})$ satisfies hypothesis (i) of Theorem 2.19.

Proof. Let $\mathbf{a}$ be the sequence $(f(T^n x) : n \in \mathbb{Z})$ and let $\delta > 0$. We want to show that we can write $\mathbf{a} = \mathbf{a}' + \mathbf{a}''$ where $\mathbf{a}'$ is a $(k-1)$-step nilsequence and $\|\mathbf{a}''\|_{U(k)} < \delta$.

Let $(Y, S, \nu)$, $p : X \to Y$, and $h$ be the $(k-1)$-step nilsystem, the factor map, and the function on $Y$ given by Corollary 3.12. Recall that $Z_{k-1}$ is the inverse limit (in both the topological and measure theoretical senses) of all factors of $X$ which are $(k-1)$-step nilsystems [HK1]. Thus $Y$ is a factor of $Z_{k-1}$ and the factor map $q : Z_{k-1} \to Y$ is continuous. Therefore the factor map $p = q \circ \pi_{k-1}$ mapping $X \to Y$ is continuous.

We define the sequences $\mathbf{a}'$ and $\mathbf{a}''$ by $a'_n = h \circ p(T^n x)$ and $a''_n = f(T^n x) - h \circ p(T^n x)$ for every $n \in \mathbb{Z}$. Then $\mathbf{a}'$ is a $(k-1)$-step nilsequence. Since the function $h \circ p$ is continuous, the function $f - h \circ p$ is Riemann integrable, and Corollary 3.11 implies that $\|\mathbf{a}''\|_{U(k)} = |||f - h \circ p|||_k < \delta$. □

We use this proposition to prove Proposition 2.20 on generalized polynomials.

Proof of Proposition 2.20. Let $p$ be a generalized polynomial. For every $n \in \mathbb{Z}$, let $\{p(n)\}$ denote the fractional part of $p(n)$. We begin with the same argument as in the proof of Corollary 2.23.

There exists an integer $\ell \ge 1$, an ergodic $\ell$-step nilsystem $(X = G/\Gamma, \mu, T)$, a point $x \in X$, and a Riemann integrable function $f$ on $X$ with $\{p(n)\} = f(T^n x)$ and $e(p(n)) = e(\{p(n)\}) = e(f(T^n x))$ for every $n \in \mathbb{Z}$.

The system $(X, \mu, T)$ satisfies the hypotheses of Proposition 7.1. Indeed, for $k > \ell$ we have that $Z_{k-1} = X$ and for $k \le \ell$, $Z_{k-1}$ is the quotient $G/(G_k \Gamma)$ of $X$. The result follows. □



We now prove Proposition 2.21, which states that the Thue-Morse sequence
also satisfies the hypothesis of Theorem 2.19.






Proof of Proposition 2.21. Let $\mathbf{a} = (a_n : n \in \mathbb{Z})$ be the Thue-Morse sequence. We recall some of its properties (see [Q]).

There exists a uniquely ergodic system $(X, T, \mu)$, a point $x_0 \in X$, and a continuous function $\phi$ on $X$ with $a_n = \phi(T^n x_0)$ for every $n \in \mathbb{Z}$. Moreover, the factor map $\pi_1 : X \to Z_1$ on the Kronecker factor $Z_1$ of $X$ is continuous. Finally, the map $\pi_1$ is two to one almost everywhere.

For every integer $k \ge 2$, the factor $Z_k$ of $X$, as given by the Structure Theorem, is an extension of $Z_{k-1}$ by a connected compact abelian group [HK1]. It follows that $Z_k = Z_1$ for every $k$.

Therefore the hypotheses of Proposition 7.1 are satisfied and we are done. □
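For readers who want to experiment with this example, here is a minimal sketch that generates a one-sided Thue-Morse sequence. The conventions are assumptions, since the text does not fix them: values in $\{+1, -1\}$, indices $n \ge 0$ only, and $a_n = (-1)^{s(n)}$ where $s(n)$ is the number of binary digits of $n$ equal to 1. The function name `thue_morse` is hypothetical.

```python
def thue_morse(n):
    """Return (-1)^(number of 1 bits in n) for an integer n >= 0."""
    if n < 0:
        raise ValueError("this sketch only covers n >= 0")
    return -1 if bin(n).count("1") % 2 else 1

# First few terms under the assumed convention.
first_terms = [thue_morse(n) for n in range(8)]

# The substitution structure: a_{2n} = a_n and a_{2n+1} = -a_n.
recursion_ok = all(
    thue_morse(2 * n) == thue_morse(n) and thue_morse(2 * n + 1) == -thue_morse(n)
    for n in range(1000)
)

# Plain averages over [0, 2^m) vanish exactly for m >= 1.
block_sum = sum(thue_morse(n) for n in range(2**10))
```

The vanishing of `block_sum` only illustrates the simplest average, against the constant sequence; the proposition concerns averages against arbitrary nilsequences.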

7.3. Proof of Theorem 2.24. We now prove the generalization of the spectral
theorem. Starting with an arbitrary measure preserving system $(Y, S, \nu)$, by ergodic
decomposition we can assume that $(Y, S, \nu)$ is an ergodic system.
We recall the following result from [HK1] (Theorem 12.1):

Theorem. Let go, ■ ■ ■ , Qh-\ be measurable functions on (Y,S,v) with ||gi||oo < 1 
for i G {0, . . . , k — 1}. Then 

JV-l „fc-l 



< c min ffi fe-i 

ie{0,...,fc-l}'" 



limsup|-J- V I TT S™gi dv 

where c is a constant depending only on k. 

Proceeding as in iBHK] (proof of Corollary 4.5 from Theorem 4.4), we deduce: 

Corollary 7.2. Let go, . . . , gt-i be measurable functions on (Y, S, v) with ||gi||oo < 
1 fori G {0,...,fc- 1}. Then 

N-l „fc-l 



limsupl V| f TT S in gi dv 
jv-»+oo iv — i y 7, 



n=0 

We deduce: 



ie{o,...,fe-i} 



Corollary 7.3. Let fi, ■ ■• , fk be bounded functions on (Y, S, v) with ||/j[[oo < 1 /or 
i G {1, . . . , fc} and Zei a = (a„ :n£Z) 6e a sequence with ||a||oo < 1- Then 

JV-l 

(18) limsup I- £ a n JJ<5 in /i r2 , , < ^'^ X ' 2 mfc . 

> > c — ' ^ ^ L 2 {y) 2£{l,...,fc} 



AT— >+oo 



TV 

n=0 i=l 



Proof. Let £ G {1, . . . , fc} be such that ||/e||fe+i = a im ie{i,...,fc}|/i|IU+i and let Q be 
the limsup in the left hand side of (fTS")) . 

By the van der Corput Lemma (Appendix lA")) : 



M-l Af-1 



Q 2 < limsup i- V llimsup^ a^n+m / IT S in (fi.S im fi) dv 



m— n— 

By the Cauchy-Schwarz Inequality, 

M-l -, Af-1 



Q 4 < limsup -J- V limsup 1 V I / TT S in (fi.S im fi) d, 

1 M-l 1 AT— 1 „fc-l 

= limsup - £ limsup - ]T I / [] ^ m (7m^ (j+1) "V,+i) ^ 
Applying Corollary 7.2 to the functions $g_i = \overline{f_{i+1}} \cdot S^{(i+1)m} f_{i+1}$, we have that

M-l -. kM-l 

Q 4 < c 2 limsup - J2 \Wl-S lm hf k < kc 2 limsup — £ |5.S"7*|2 

kM — 1 -i /oA:-l 



<fc c 2 (iimsu P — £ \ih-s m fm 



m=0 

by the Holder Inequality. By ([3]) , the last lim sup is actually a limit and is equal to 
l/^lfe+i an d we are done. □ 

We now return to the proof of Theorem 2.24. We assume that $\mathbf{a} = (a_n : n \in \mathbb{Z})$ is a bounded sequence such that the averages

$$\frac{1}{N} \sum_{n=0}^{N-1} a_n b_n$$

converge as $N \to +\infty$ for every $k$-step nilsequence $\mathbf{b} = (b_n : n \in \mathbb{Z})$. We assume that $(Y, S, \nu)$ is an ergodic system and $f_1, \dots, f_k \in L^\infty(\nu)$. We show the convergence of the averages

$$\frac{1}{N} \sum_{n=0}^{N-1} a_n \, S^n f_1 \cdots S^{kn} f_k$$

in $L^2(\nu)$.

Let $Z_k$ be the $k$-th factor of $(Y, S, \nu)$, as given by the Structure Theorem. If for some $i \in \{1, \dots, k\}$ we have $\mathbb{E}(f_i \mid Z_k) = 0$, then $|||f_i|||_{k+1} = 0$. Then by Corollary 7.3 the above averages converge to zero in $L^2(\nu)$. We say that the factor $Z_k$ is characteristic for the convergence of these averages.

Therefore, in order to prove the convergence of these averages for arbitrary bounded functions, it suffices to prove the convergence when the functions are measurable with respect to the factor $Z_k$.

By the Structure Theorem, $Z_k$ is an inverse limit of $k$-step nilsystems. Thus by density, we can assume that the functions $f_i$ are measurable with respect to a $k$-step nilsystem $(Z, S)$ which is a factor of $(Y, S, \nu)$. By density again, we are reduced to the case that $(Y, \nu, S)$ is a $k$-step nilsystem and that the functions $f_1, \dots, f_k$ are continuous.

But in this case, for every $y \in Y$ the sequence

$$\bigl( f_1(S^n y) \cdot f_2(S^{2n} y) \cdots f_k(S^{kn} y) : n \in \mathbb{Z} \bigr)$$

is a $k$-step nilsequence and by hypothesis, the averages

$$\frac{1}{N} \sum_{n=0}^{N-1} a_n \, f_1(S^n y) \cdot f_2(S^{2n} y) \cdots f_k(S^{kn} y)$$

converge for every $y \in Y$. □



Appendix A. The van der Corput Lemma
We state the van der Corput Lemma, as used in our set up (see [KN]):

van der Corput's Lemma. Let $\mathbf{a} = (a_n : n \in \mathbb{Z})$ be a sequence with $|a_n| \le 1$ for all $n \in \mathbb{Z}$ and let $I$ be an interval in $\mathbb{Z}$. Then for every integer $H \ge 1$, we have

$$\Bigl| \frac{1}{|I|} \sum_{n \in I} a_n \Bigr|^2 \le \frac{4H}{|I|} + \Bigl| \sum_{h=-H}^{H} \frac{H - |h|}{H^2} \, \frac{1}{|I|} \sum_{n \in I} a_{n+h} \overline{a_n} \Bigr| .$$
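A finite inequality of this kind, in the form $\bigl|\frac{1}{|I|}\sum_{n \in I} a_n\bigr|^2 \le \frac{4H}{|I|} + \bigl|\sum_{h=-H}^{H} \frac{H-|h|}{H^2} \frac{1}{|I|} \sum_{n \in I} a_{n+h}\overline{a_n}\bigr|$, lends itself to a direct numerical check. The sketch below is an illustration under assumptions not in the text: the test sequences (a quadratic phase and a constant sequence) and the function names are hypothetical choices, and the sequence is given as a function on all of $\mathbb{Z}$ so that the shifted terms $a_{n+h}$ are defined.

```python
import cmath
import math

def quadratic_phase(n):
    """Hypothetical test sequence with |a_n| = 1."""
    return cmath.exp(2j * math.pi * math.sqrt(2) * n * n)

def vdc_sides(seq, start, length, H):
    """Return (left side, right side) of the inequality on I = [start, start + length)."""
    interval = range(start, start + length)
    lhs = abs(sum(seq(n) for n in interval) / length) ** 2
    weighted = 0j
    for h in range(-H, H + 1):
        # Correlation of the sequence with its shift by h, averaged over I.
        corr = sum(seq(n + h) * seq(n).conjugate() for n in interval) / length
        weighted += (H - abs(h)) / H**2 * corr
    rhs = 4 * H / length + abs(weighted)
    return lhs, rhs

checks = [
    vdc_sides(quadratic_phase, start, length, H)
    for start in (-50, 0, 17)
    for length in (40, 200)
    for H in (1, 5, 20)
]
constant_case = vdc_sides(lambda n: 1 + 0j, 0, 100, 10)
```

For the constant sequence the weights $(H - |h|)/H^2$ sum to 1, so the right side equals $1 + 4H/|I|$ while the left side equals 1; for the quadratic phase the correlations are small and the bound is informative once $H$ is large and $H/|I|$ is small.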



Appendix B. Parallelepipeds in nilmanifolds

We explain the cubic structure associated to a nilmanifold. In the literature, there are (at least) two presentations of these objects, in [HK1] and in Appendix E of [GT2]. The results proved in these papers are often recalled here without proof, but we need a bit more than just those results. We use the notation of [HK1]. The group that we denote by $G^{[k]}_{k-1}$ is the same as the group $HP^k(G)$ of [GT2].

The $k$'s in index and exponent that occur everywhere are cumbersome but necessary as we use an induction at some point.

B.1. Algebraic preliminaries. We begin with some algebraic constructions involving "cubes." Let $G$ be a group and $k \ge 1$ be an integer.

B.1.1. Two constructions of the "side group". We use the notation of Section 3.2. We write $\mathbf{0} = (0, 0, \dots, 0) \in \{0,1\}^k$ and $\mathbf{1} = (1, 1, \dots, 1) \in \{0,1\}^k$.

As before, if $X$ is a set, $X^{[k]} = X^{2^k}$ and points of $X^{[k]}$ are written as $\mathbf{x} = (x_\epsilon : \epsilon \in \{0,1\}^k)$. For $x \in X$, $x^{[k]} \in X^{[k]}$ is the element $(x, x, \dots, x)$, with $x$ repeated $2^k$ times. If $f : X \to Y$ is a map, $f^{[k]} : X^{[k]} \to Y^{[k]}$ denotes the diagonal map: $(f^{[k]}(\mathbf{x}))_\epsilon = f(x_\epsilon)$ for all $\epsilon \in \{0,1\}^k$.

For $g \in G$ and $1 \le i \le k$, $g_i^{[k]} = ((g_i^{[k]})_\epsilon : \epsilon \in \{0,1\}^k)$ is given by:

$$(g_i^{[k]})_\epsilon = \begin{cases} g & \text{if } \epsilon_i = 1 \\ 1 & \text{if } \epsilon_i = 0 . \end{cases}$$

(Note that we mean $\epsilon = (\epsilon_1, \dots, \epsilon_k)$.) $G^{[k]}_{k-1}$ is the subgroup of $G^{[k]}$ spanned by

$$\{g^{[k]} : g \in G\} \cup \{g_i^{[k]} : 1 \le i \le k, \ g \in G\} .$$

The same group was also introduced in [GT2], but with a different definition and notation. We recall their presentation, but in our notation, substituting "upper faces" for "lower faces" for coherence. We start with some notation.

It is convenient to view {0, l} fc as the set of vertices of the unit Euclidean cube. 
If J is a subset of {1, ... , k} and -q G {0, 1} J , the set 

a = {e e {0, l} fc : e t = m for all i G J} 

is called & face of {0, l} fc . The dimension of a is dim(a) = k—\J\. If all coordinates 
of 7/ arc equal to 1, then this face is called an upper face. In particular, ctQ — {0, l} k 
is the unique upper face of dimension fc, corresponding to J = 0; {1} is the unique 
upper face of dimension zero, corresponding to J = {1, . . . , k}. The k upper faces 
of dimension k — 1 are a; = {e G {0, l} k : ei = 1} for 1 < i < k. Let ao, ai, . . . , a 2 k 
be an enumeration of all of the upper faces such that ao, . . . , otk are as above and 
dim(ai) is a decreasing sequence; in particular, a 2 fe = {1}. 
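The combinatorics of upper faces can be made concrete with a short enumeration. This is a sketch under assumptions not in the text: vertices of the cube are represented as tuples of 0s and 1s, an upper face is indexed by the subset $J$ of coordinates forced to equal 1, and the function name `upper_faces` is hypothetical. Faces are listed by increasing $|J|$, that is, by decreasing dimension $k - |J|$.

```python
from itertools import combinations, product

def upper_faces(k):
    """List (J, face) for every subset J of {1, ..., k}, where face is the set of
    vertices eps of {0,1}^k with eps_i = 1 for all i in J."""
    cube = list(product((0, 1), repeat=k))
    faces = []
    for size in range(k + 1):  # |J| increasing, so dimension k - |J| decreasing
        for J in combinations(range(1, k + 1), size):
            face = [eps for eps in cube if all(eps[i - 1] == 1 for i in J)]
            faces.append((J, face))
    return faces

k = 3  # hypothetical small example
faces = upper_faces(k)
dimensions = [k - len(J) for J, _ in faces]
```

With `k = 3` this yields $2^3 = 8$ upper faces: the whole cube first ($J$ empty), then the three faces of dimension 2, and finally the single vertex $(1, 1, 1)$; a face indexed by $J$ has $2^{k - |J|}$ vertices and always contains the all-ones vertex.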
If $\alpha$ is a face and $g \in G$, we write $g_\alpha = ((g_\alpha)_\epsilon : \epsilon \in \{0,1\}^k)$ for the element of

G [k] 

given by: 

1 1 otherwise . 

In particular, the elements gf^ defined above can be written as ga} ■ 

In [GT2] . HP fc (G) is defined to be the set of elements g G G^ that can be 
written as 

(19) g = (ffi)W (0j)W . . . (ff 2 »)12* where ^ e G fc-dim(a < ) for every i G {1, . . . , k} . 

Here Go = G% = G; in all other places in the paper, we use Go to denote a different 
object (the connected component of the identity of G). 

Let us explain briefly why G k k }_ x and HP fe (G) are actually equal. By a direct 
computation, Green and Tao show that HP ,C (G) is a subgroup of GW; since it 
contains the generators of G^\, it contains this group. On the other hand, it is 
shown in [HK1] (section 5) that for every side a of dimension d and every g G 
Gk-dim(a), 9a belongs to G^ (and more precisely to (Gj s fe l 1 ) fc _ dim ( Q! )) and thus 
HP fc (G) C G l £_ v We have equality. 

\k] 

In the sequel we only use the notation G k _ 1 . Depending on the property to be 
proven, the first or second presentation is more convenient. 

B.1.2. Algebraic properties. We have:

(i) Let $\Gamma$ be a subgroup of $G$. If all coordinates of $\mathbf{g} \in G^{[k]}_{k-1}$ belong to $\Gamma$ except possibly $g_{\mathbf{0}}$, then $g_{\mathbf{0}} \in \Gamma G_k$.

(ii) In particular, if all coordinates of $\mathbf{g} \in G^{[k]}_{k-1}$ are equal to 1 except possibly $g_{\mathbf{0}}$, then $g_{\mathbf{0}} \in G_k$.

The second statement is proved (in a perhaps concealed place) in [HK1] via induction on $k$, and the first one is not stated explicitly but follows with a similar proof. Both statements follow easily from the second definition of $G^{[k]}_{k-1}$ and the symmetry of this set, allowing us to substitute the coordinate $g_{\mathbf{1}}$ for $g_{\mathbf{0}}$.

We need two more groups for our proofs. In this appendix, we write 

H k = {g G Gjfli : go = 1} and Gf = {g^ : g G G} . 

(The first group is not defined in the papers.) Then Hk is clearly a normal subgroup 
of Gj^ and Gj*^ = H k .G l £ ] . Moreover, H k is the group spanned by the elements 

\k] \k] 

g\ for 1 < i < k and g G G; in the second presentation of G k _ 1 , it consists of 
elements that can be written as in (|19p with gx = 1. 
We have 

(iii) (H k ) 2 = H k n(G 2 )M. 

(iv) (Gi fc i 1 ) 2 = GL fe I 1 n(G 2 )M. 

Proof. We prove flu} . The inclusion (H k )2 C Hfc D (G2)' fe ' is obvious. 

Let $\alpha$ be a face of dimension $d < k - 1$ containing $\mathbf{1}$. Let $g \in G$ and $h \in G_{k-d-1}$. We can choose a face $\beta$ of dimension $k - 1$ and a face $\gamma$ of dimension $d + 1$ such that $\alpha = \beta \cap \gamma$. We have

$$g_\beta \in H_k ; \quad h_\gamma \in H_k \quad\text{and}\quad [g; h]_\alpha = [g_\beta ; h_\gamma] .$$

Thus $[g; h]_\alpha \in (H_k)_2$. Therefore, for any $q \in G_{k-d}$, we have that $q_\alpha \in (H_k)_2$.

Using this remark, we can show the inclusion Hk (~l (G 2 )[ fe ] C (-Hfe)2- Let g be 
in the first of these groups. We write g as in (fT9| with gi = 1. By the remark, all 

terms of the form (gj)aj with dim(aj) < k — 1 in the product belong to (i?fc) 2 and 
we are reduced to show that the product of the k remaining terms also belongs to 
this group. We remark that all coordinates of this product belong to G 2 . 

Let g a be one of these terms. Then a is an upper face of dimension k — 1 and 
it is immediate that there exists r\ £ {0, l} fe such that r\ belongs to a and does not 
belong to any other upper face of dimension k — 1. Therefore, g is the coordinate r\ 
of the product and g £ G 2 . It follows that g$ belongs to (i?fc) 2 and we are done. 

We now deduce (frv|) . Again, the inclusion (g[1 1 ) 2 C g[1 x n (G 2 ) [fc] is obvious. 
Let g £ Gjfli n (G 2 ) W . We write 3 = /i^q where h £ G and £ £ We have that 
go = ft and so h £ G 2 . Thus ftM £ (G 2 ) [fc] . Moreover, qeH k n {G 2 ) [k] and by the 
second part of the Lemma, q £ {Hk)2 C (G 2 ){ £ fc l 1 . □ 

B.2. Topological properties. Henceforth $G$ is an $r$-step nilpotent Lie group, $\Gamma$ is a discrete cocompact subgroup, and $X = G/\Gamma$. In applications $r$ will be equal to $k - 1$ but the general case is used in an induction below.
In [HK1] and [GT2], it is shown that

(v) $G^{[k]}_{k-1}$ is a closed subgroup of $G^{[k]}$ and hence is an $r$-step nilpotent Lie group.

(vi) The group := L^ n G^_ x is a cocompact subgroup of gJ._ 1 . 
We do not reproduce the proof here. We define: 

For the moment we write v k for the Haar measure of 

The image of t/fe under the projection x 1— > xo is equal to the Haar measure \x of 
A. We have that: 

(vii) The group Ofc := Hk n rt fc l is cocompact in Hk- 

Proof. Every $g \in H_k$ belongs to $G^{[k]}_{k-1}$ and thus is at a bounded distance from some
$\gamma \in \Lambda_k$. Since $g_0 = 1$, $\gamma_0$ is at a bounded distance from $1$. Since $\Gamma$ is discrete, $\gamma_0$
belongs to a finite subset $F$ of $\Gamma$.

We have that $g$ is at a bounded distance from $((\gamma_0)^{[k]})^{-1}\gamma$, which belongs to
$\Gamma^{[k]} \cap H_k = \Theta_k$. □

We define $W_k = H_k/\Theta_k$. Then $W_k$ is a $(k-1)$-step nilmanifold, naturally
included in $X_k$ as a closed subset.

For every $g \in G$ we have that $g^{[k]}$ belongs to $G^{[k]}_{k-1}$. We deduce that for every
$x \in X$, we have that $x^{[k]} := (x, x, \dots, x)$ belongs to $X_k$.

For every $x \in X$, we write

$$W_{k,x} = \{\mathbf{x} \in X_k : x_0 = x\} .$$

We show:

(viii) Let $x \in X$ and $g$ be a lift of $x$ in $G$. Then $W_{k,x} = g^{[k]} \cdot W_k$.
Proof. Let $\mathbf{x} \in W_{k,x}$ and $h$ be a lift of $\mathbf{x}$ in $G^{[k]}_{k-1}$. Since $x_0 = x$, we have that
$h_0 = g\gamma$ for some $\gamma \in \Gamma$. Let $q = (g^{[k]})^{-1} h (\gamma^{[k]})^{-1}$. Then $q \in H_k$ and its image
$\mathbf{y}$ in $W_k$ satisfies $g^{[k]} \cdot \mathbf{y} = \mathbf{x}$. We thus have that $W_{k,x} \subset g^{[k]} \cdot W_k$ and the opposite
inclusion is obvious. □

B.3. Dynamical properties. Henceforth, we assume that $X$ is endowed with the
translation $T$ by some $\tau \in G$ and that $(X, T, \mu)$ is ergodic. Recall that the same
nilmanifold can be represented as a quotient in different ways. As usual we assume
that $G$ is spanned by the connected component $G_0$ of the identity and $\tau$. We claim
that:

(ix) $(G^{[k]}_{k-1})_0 = (G_0)^{[k]}_{k-1}$.

(x) $G^{[k]}_{k-1}$ is spanned by $(G^{[k]}_{k-1})_0$, $\tau^{[k]}$, and the elements $\tau_i^{[k]}$, $1 \le i \le k$.

(xi) $H_k$ is spanned by $(H_k)_0$ and the elements $\tau_i^{[k]}$, $1 \le i \le k$.

Proof. By hypothesis and the first definition of $G^{[k]}_{k-1}$, this group is spanned by
elements of the form $g^{[k]}$ for $g \in G_0$, $g_i^{[k]}$ for $g \in G_0$ and $1 \le i \le k$, $\tau_i^{[k]}$ for
$1 \le i \le k$, and $\tau^{[k]}$. This proves (x).

The commutator of two elements of the above type belongs to $(G_2)^{[k]}_{k-1} \subset
(G_0)^{[k]}_{k-1}$, because it follows from our assumption that $G_2 \subset G_0$. Then every element
$g$ of $G^{[k]}_{k-1}$ can be written as $g = h\,(\tau^{[k]})^{n} (\tau_1^{[k]})^{m_1} \cdots (\tau_k^{[k]})^{m_k}$ with $h \in (G_0)^{[k]}_{k-1}$.

If $g \in (G^{[k]}_{k-1})_0$, then by looking at the coordinate $0$ of $g$ we have that $h_0 \tau^{n} = g_0$
belongs to $G_0$. Thus $\tau^{n} \in G_0$.

Let $i \in \{1, \dots, k\}$. As in the proof of (iii), there exists $\eta \in \{0,1\}^k$ such that
$(\tau_i^{[k]})_\eta = \tau$ and $(\tau_j^{[k]})_\eta = 1$ for $j \ne i$. We have that $g_\eta = h_\eta \tau^{n} \tau^{m_i}$ and thus, since $\tau^{n} \in G_0$, $\tau^{m_i} \in G_0$. Thus
$(\tau_i^{[k]})^{m_i} \in (G_0)^{[k]}_{k-1}$. This achieves the proof of (ix).

Now assume that $g \in (H_k)_0$. Then it belongs to $(G^{[k]}_{k-1})_0$ and we write it as
above, $g = h\,(\tau_1^{[k]})^{m_1} \cdots (\tau_k^{[k]})^{m_k}$ with $h \in (G_0)^{[k]}_{k-1}$. We have that $h_0 = g_0 = 1$ and
so $h \in H_k \cap (G_0)^{[k]}_{k-1}$ and this element belongs to $(H_k)_0$. This proves (xi). □

(xii) $X_k$ is ergodic under the action of $T^{[k]}$ and $T_i^{[k]}$, $1 \le i \le k$.

(xiii) $W_k$ is ergodic under the transformations $T_i^{[k]}$, $1 \le i \le k$.

Proof. Let $Z$ be the compact abelian group $G/\Gamma G_2$ and $\sigma$ be the image of $\tau$ in $Z$.
Since $T$ is ergodic, the translation by $\sigma$ on $Z$ is ergodic.

By (iv) and (any) definition of $G^{[k]}_{k-1}$, the quotient $G^{[k]}_{k-1}/(G^{[k]}_{k-1})_2 \Lambda_k$ can be
identified with the subgroup $Z^{[k]}_{k-1}$ of $Z^{[k]}$. This group consists of the points $\mathbf{z}$ of
$Z^{[k]}$ which can be written as

$$\mathbf{z} = \Bigl( u \prod_{i=1}^{k} v_i^{\epsilon_i} : \epsilon \in \{0,1\}^k \Bigr)$$

for some $u, v_1, \dots, v_k \in Z$. The transformations induced on this group by the
transformations $T^{[k]}$ and $T_i^{[k]}$, $1 \le i \le k$, are the translations by $\sigma^{[k]}$ and $\sigma_i^{[k]}$. In
the above parametrization of $Z^{[k]}_{k-1}$, these transformations correspond to the map
$u \mapsto \sigma u$ and to the maps $v_i \mapsto \sigma v_i$, respectively.
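
To make the parametrization concrete, here is the case $k = 2$ written out. This is only an illustrative sketch: the ordering $00, 10, 01, 11$ of the coordinates is a choice made here, and it assumes the convention that $\sigma_i^{[k]}$ has coordinate $\sigma$ at the vertices with $\epsilon_i = 1$ and coordinate $1$ elsewhere.

$$\mathbf{z} = (z_{00}, z_{10}, z_{01}, z_{11}) = (u,\; u v_1,\; u v_2,\; u v_1 v_2) .$$

$$\sigma^{[2]} \mathbf{z} = (\sigma u,\; \sigma u v_1,\; \sigma u v_2,\; \sigma u v_1 v_2), \qquad \sigma_1^{[2]} \mathbf{z} = (u,\; u (\sigma v_1),\; u v_2,\; u (\sigma v_1) v_2) .$$

So translation by $\sigma^{[2]}$ replaces $u$ by $\sigma u$ and leaves $v_1, v_2$ unchanged, while translation by $\sigma_1^{[2]}$ replaces $v_1$ by $\sigma v_1$ and leaves $u, v_2$ unchanged. In these coordinates the action is a product of translations by $\sigma$ on three copies of $Z$.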
Since the translation by $\sigma$ on $Z$ is ergodic, it follows easily that $Z^{[k]}_{k-1}$ is ergodic
under the translations by $\sigma^{[k]}$ and $\sigma_i^{[k]}$. By (x) and Theorem 3.6, $X_k$ is ergodic
under the action of $T^{[k]}$ and $T_i^{[k]}$, $1 \le i \le k$.

The second statement is proved in the same way. □

We show: 

(xiv) The Haar measure $\nu_k$ of $X_k$ is equal to the measure $\mu^{[k]}$ defined in [HK1]
and described in Section 3.2.

This result is proved in [HK1], but the context is so different from the present one
that we prefer to give a complete proof here.

Proof. We use induction on $k$. By definition, $G^{[1]}_0 = G \times G$ and so $X_1 = X \times X$
and $\nu_1 = \mu \times \mu$, which is equal to the measure $\mu^{[1]}$ of [HK1].

Assume that the announced property holds up to $k-1$ for some $k > 1$. In order
to show the property for $k$, it suffices to show that when $f_\epsilon$, $\epsilon \in \{0,1\}^k$, are $2^k$
continuous functions on $X$, we have that the function $F$ defined on $X^{[k]}$ by

$$F(\mathbf{x}) = \prod_{\epsilon \in \{0,1\}^k} f_\epsilon(x_\epsilon)$$

has the same integral under the measures $\mu^{[k]}$ and $\nu_k$.

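For every $x \in X$, the point $x^{[k]} = (x, x, \dots, x)$ belongs to $X_k$. Since $(X_k, T^{[k]}, T_1^{[k]}, \dots, T_k^{[k]})$
is uniquely ergodic with invariant measure $\nu_k$, we have that

$$\int F(\mathbf{x})\, d\nu_k(\mathbf{x}) = \lim_{L \to +\infty} \frac{1}{L} \sum_{\ell=0}^{L-1} \Bigl( \lim_{M \to +\infty} \frac{1}{M^{k-1}} \sum_{m_1, \dots, m_{k-1}=0}^{M-1} \Bigl( \lim_{N \to +\infty} \frac{1}{N} \sum_{n=0}^{N-1} \prod_{\epsilon \in \{0,1\}^k} f_\epsilon\bigl(T^{n + \epsilon \cdot m + \epsilon_k \ell} x\bigr) \Bigr) \Bigr)$$

where $m = (m_1, \dots, m_{k-1})$ and $\epsilon \cdot m = \epsilon_1 m_1 + \dots + \epsilon_{k-1} m_{k-1}$. By unique ergodicity
of $(X, T, \mu)$, this is equal to

$$\lim_{L \to +\infty} \frac{1}{L} \sum_{\ell=0}^{L-1} \lim_{M \to +\infty} \frac{1}{M^{k-1}} \sum_{m_1, \dots, m_{k-1}=0}^{M-1} \int \prod_{\epsilon \in \{0,1\}^k} f_\epsilon\bigl(T^{\epsilon \cdot m + \epsilon_k \ell} x\bigr)\, d\mu(x) .$$

We write each $\epsilon \in \{0,1\}^k$ in the form $\eta 0$ or $\eta 1$ with $\eta \in \{0,1\}^{k-1}$ and this
expression can be rewritten as

$$\lim_{L \to +\infty} \frac{1}{L} \sum_{\ell=0}^{L-1} \lim_{M \to +\infty} \frac{1}{M^{k-1}} \sum_{m_1, \dots, m_{k-1}=0}^{M-1} \int \prod_{\eta \in \{0,1\}^{k-1}} \bigl(f_{\eta 0} \cdot T^{\ell} f_{\eta 1}\bigr)\bigl(T^{\eta \cdot m} x\bigr)\, d\mu(x) .$$

By unique ergodicity of $X_{k-1}$ under the transformations $T^{[k-1]}$ and $T_i^{[k-1]}$, $1 \le i \le
k-1$, and proceeding as above, we have that this expression is equal to

$$\lim_{L \to +\infty} \frac{1}{L} \sum_{\ell=0}^{L-1} \int \prod_{\eta \in \{0,1\}^{k-1}} \bigl(f_{\eta 0} \cdot T^{\ell} f_{\eta 1}\bigr)(x_\eta)\, d\nu_{k-1}(\mathbf{x}) .$$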
By the induction hypothesis, the integral remains unchanged when the measure
$\mu^{[k-1]}$ is substituted for $\nu_{k-1}$. We rewrite this expression as

$$(20) \qquad \lim_{L \to +\infty} \frac{1}{L} \sum_{\ell=0}^{L-1} \int F_0 \cdot F_1 \circ (T^{[k-1]})^{\ell}\, d\mu^{[k-1]}$$

where

$$F_0(\mathbf{x}) = \prod_{\eta \in \{0,1\}^{k-1}} f_{\eta 0}(x_\eta) \quad\text{and}\quad F_1(\mathbf{x}) = \prod_{\eta \in \{0,1\}^{k-1}} f_{\eta 1}(x_\eta) .$$

Let $\mathcal{I}$ denote the $T^{[k-1]}$-invariant $\sigma$-algebra of the measure $\mu^{[k-1]}$. The limit (20)
is equal to

$$\int \mathbb{E}(F_0 \mid \mathcal{I})\, \mathbb{E}(F_1 \mid \mathcal{I})\, d\mu^{[k-1]} .$$

By the inductive definition of the measure $\mu^{[k]}$ in [HK1] (section 3), this is equal to

$$\int F_0\bigl(x_{\eta 0} : \eta \in \{0,1\}^{k-1}\bigr)\, F_1\bigl(x_{\eta 1} : \eta \in \{0,1\}^{k-1}\bigr)\, d\mu^{[k]}(\mathbf{x})$$

and the function in the integral is just the function $F$. □
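
The indexing in this proof can be checked by hand in the smallest nontrivial case. The following sketch takes $k = 2$, so that $m = (m_1)$, and assumes the convention that $T_i^{[k]}$ acts by $T$ on the coordinates with $\epsilon_i = 1$ and trivially on the others; the ordering $00, 10, 01, 11$ of coordinates is a choice made here.

$$(T^{[2]})^{n} (T_1^{[2]})^{m_1} (T_2^{[2]})^{\ell}\, x^{[2]} = \bigl(T^{n}x,\; T^{n+m_1}x,\; T^{n+\ell}x,\; T^{n+m_1+\ell}x\bigr),$$

which agrees with the exponent $n + \epsilon \cdot m + \epsilon_k \ell$ at each vertex $\epsilon = \epsilon_1\epsilon_2$. Writing $\epsilon = \eta 0$ or $\eta 1$ with $\eta \in \{0,1\}$ and $T^{\ell} f = f \circ T^{\ell}$, the product regroups as

$$\prod_{\epsilon \in \{0,1\}^2} f_\epsilon\bigl(T^{n + \epsilon \cdot m + \epsilon_2 \ell} x\bigr) = \bigl(f_{00} \cdot T^{\ell} f_{01}\bigr)(T^{n}x)\; \bigl(f_{10} \cdot T^{\ell} f_{11}\bigr)(T^{n+m_1}x) .$$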

Recall that the measure $\mu^{[k]}$ satisfies the inequality © of Section 3.2. This can
probably be proved directly for the measure $\nu_k$ but does not seem obvious.

B.4. The fibers. Recall that for every $x \in X$, $W_{k,x} = \{\mathbf{x} \in X_k : x_0 = x\}$.

(xv) For every $x \in X$, $W_{k,x}$ is uniquely ergodic under the transformations $T_i^{[k]}$,
$1 \le i \le k$.

We write $\rho_x$ for the invariant measure of $W_{k,x}$.

(xvi) For every $x \in X$ and $h \in G$, $\rho_{h \cdot x}$ is the image of $\rho_x$ under the translation
by $h^{[k]}$.

Proof. Let $g$ be a lift of $x$ in $G$ and $\tilde\tau = g \tau g^{-1}$.

For $1 \le i \le k$, we have that $\tilde\tau_i^{[k]} = g^{[k]} \tau_i^{[k]} (g^{[k]})^{-1}$ and all these elements commute
and belong to $H_k$. For $1 \le i \le k$, let $\tilde T_i^{[k]}$ be the translation by $\tilde\tau_i^{[k]}$.

We first show that the nilsystem $(W_k, \tilde T_1^{[k]}, \dots, \tilde T_k^{[k]})$ is uniquely ergodic. For each
$i$, $\tilde\tau_i^{[k]} (\tau_i^{[k]})^{-1}$ belongs to $H_k \cap (G_2)^{[k]}$ and thus to $(H_k)_2$ by (iii). Therefore, $\tilde\tau_i^{[k]}$ and
$\tau_i^{[k]}$ have the same projection on the compact abelian group $H_k/(H_k)_2 \Theta_k$. By (xiii),
the action induced by $\tau_i^{[k]}$, $1 \le i \le k$, on this group is ergodic. The criteria given
by Theorem 3.6 and property (xi) give the announced unique ergodicity.

By (viii), we have that $g^{[k]} \cdot W_k = W_{k,x}$. The map $\mathbf{y} \mapsto g^{[k]} \cdot \mathbf{y}$ mapping $(W_k, \tilde T_1^{[k]}, \dots, \tilde T_k^{[k]})$
to $(W_{k,x}, T_1^{[k]}, \dots, T_k^{[k]})$ is an isomorphism of topological systems and thus the
second of these systems is uniquely ergodic. This proves (xv).

We write $\rho$ for the Haar measure of the nilmanifold $W_k = H_k/\Theta_k$. Then $\rho$
is the invariant measure of $W_k$ and the above proof shows that for every $x \in X$ and every lift $g$ of $x$ in $G$,
the invariant measure $\rho_x$ of $W_{k,x}$ is the image of $\rho$ under translation by $g^{[k]}$. This
immediately implies (xvi). □



In fact, $W_{k,x}$ can be given the structure of a nilmanifold, quotient of the group
$H_k$ by the discrete cocompact group $g^{[k]} \Theta_k (g^{[k]})^{-1}$, and the transformations $T_i^{[k]}$ are
translations on this nilmanifold.
B.5. The case that $G$ is $(k-1)$-step nilpotent. Henceforth we assume that
$G$ is a $(k-1)$-step nilpotent group.
We show:

(xvii) Let $X_{k*}$ be the image of $X_k$ under the projection $\mathbf{x} \mapsto \mathbf{x}_*$
mapping $X^{[k]}$ to $X^{2^k-1}$. There exists a smooth map $\Phi : X_{k*} \to X$ such
that

$$(21) \qquad X_k = \bigl\{ (\Phi(\mathbf{x}_*), \mathbf{x}_*) : \mathbf{x}_* \in X_{k*} \bigr\} .$$

Different proofs are given for the existence and continuity of $\Phi$ in [HK1] and [GT2].
The smoothness of $\Phi$ can be easily deduced from these proofs, but this property is
not stated in these papers. For completeness, we give a short complete proof.

Proof. First we remark that the projection $X_k \to X_{k*}$ is one to one. Indeed, let $\mathbf{x}$
and $\mathbf{y}$ be two points of $X_k$ with the same projections. We lift them to two elements
$g$ and $h$ of $G^{[k]}_{k-1}$. All the coordinates of $hg^{-1}$ belong to $\Gamma$ except the first one, and
by Q this coordinate also belongs to $\Gamma G_k = \Gamma$. Thus $\mathbf{x} = \mathbf{y}$.

Therefore the projection $X_k \to X_{k*}$ is a homeomorphism. By composing the
inverse of this map with the projection $\mathbf{x} \mapsto x_0$, we obtain a continuous map
$\Phi : X_{k*} \to X$ satisfying (21). We are left with showing that it is smooth.

Let $G_*$ be the image of $G^{[k]}_{k-1}$ in $G^{2^k-1}$ under the map $g \mapsto g_*$. By ([u]), the
projection $G^{[k]}_{k-1} \to G_*$ is one to one.

We check that $G_*$ is a closed subgroup of $G^{2^k-1}$. Let $(g_{*,n})$ be a sequence in $G_*$
converging to some $g_*$. For each $n$, there exists $g_{0,n} \in G$ with $g_n = (g_{0,n}, g_{*,n}) \in
G^{[k]}_{k-1}$ and there exists $\gamma_n \in \Gamma^{[k]} \cap G^{[k]}_{k-1}$ at a bounded distance from $g_n$. All the
coordinates of $\gamma_n$, except $\gamma_{0,n}$, are for all $n$ at a bounded distance from the unit.
By passing to subsequences, we can assume that they do not depend on $n$. By (Q),
$\gamma_n$ does not depend on $n$. Therefore, $g_n$ remains at a bounded distance from the
unit and taking a subsequence we can assume that it converges to some $g$, which
belongs to $G^{[k]}_{k-1}$ by (v). Then the projection of $g$ on $G_*$ is equal to $g_*$. Thus $g_*$
belongs to $G_*$.

Now, the projection $G^{[k]}_{k-1} \to G_*$ is a smooth bijective homomorphism between
Lie groups. Therefore it is a diffeomorphism. Since the projection $G^{[k]}_{k-1} \to X_k$ has
discrete kernel, it follows that the projection $X_k \to X_{k*}$ is a diffeomorphism and
thus that $\Phi$ is smooth. □
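
For orientation, the map $\Phi$ can be written explicitly in the simplest case. The following is a sketch under stated assumptions: $k = 2$, so that $G$ is $1$-step nilpotent, that is, abelian, and $X = G/\Gamma$ is a compact abelian group written multiplicatively; the first presentation of $G^{[2]}_1$ is taken with the convention that $g_i^{[2]}$ has coordinate $g$ at the vertices with $\epsilon_i = 1$; the letters $a$, $b$ are introduced here for the example. Then

$$X_2 = \bigl\{ (x_{00}, x_{10}, x_{01}, x_{11}) = (x,\; xa,\; xb,\; xab) : x, a, b \in X \bigr\},$$

and the first coordinate is recovered from the three others by

$$\Phi(x_{10}, x_{01}, x_{11}) = x_{10}\, x_{01}\, x_{11}^{-1}, \qquad\text{since}\qquad (xa)(xb)(xab)^{-1} = x .$$

In this case $\Phi$ is a composition of group operations on a compact abelian Lie group, hence smooth, which is consistent with the general statement.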

We deduce: 

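(xviii) $|\!|\!| \cdot |\!|\!|_k$ is a norm on $C(X)$.

Proof. It suffices to show that if $f \in C(X)$ satisfies $|\!|\!| f |\!|\!|_k = 0$, then $f = 0$. By
Proposition ^. 3[ if $f_\epsilon$, $\epsilon \in \{0,1\}^k_*$, are $2^k - 1$ continuous functions on $X$, then we have
that

$$\int f(x_0) \prod_{\epsilon \in \{0,1\}^k_*} f_\epsilon(x_\epsilon)\, d\mu^{[k]}(\mathbf{x}) = 0 .$$

By density, $\int f(x_0) F(\mathbf{x}_*)\, d\mu^{[k]}(\mathbf{x}) = 0$ for every continuous function $F$ on $X_{k*}$. Taking
$F = \bar f \circ \Phi$ where $\Phi$ is as in statement [n] of Theorem 5.1, property (21) of this function,
namely $x_0 = \Phi(\mathbf{x}_*)$ for every $\mathbf{x} \in X_k$, together with the fact that $\mu^{[k]} = \nu_k$ is concentrated on $X_k$ by (xiv),
gives

$$0 = \int f(x_0)\, \bar f(\Phi(\mathbf{x}_*))\, d\mu^{[k]}(\mathbf{x}) = \int |f(x_0)|^2\, d\mu^{[k]}(\mathbf{x}) = \int |f(x)|^2\, d\mu(x)$$

because the projection of $\mu^{[k]}$ on $X$ is $\mu$. □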